I came across code in SQLite (this article applies to Postgres too)
CASE WHEN name < 'c' THEN 'combined'
ELSE name
END as name
where the column alias (name after END AS) of a CASE expression used the same name as the raw input column (name after WHEN), which can give confusing results for beginners. This article explains what is happening in such cases with a short experiment at http://sqlfiddle.com/#!7/2676c/3.
If you try your own tables, write the DML SQL and click “Build Schema” before running queries with “Run SQL”.
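The same kind of experiment can be sketched locally with Python's built-in sqlite3 module instead of SQL Fiddle. This is a minimal sketch under assumptions: the table name t, its four rows, and the alias name_group are hypothetical and are not taken from the fiddle. It runs the CASE expression once with the alias reusing the column name, and once with a distinct alias so that GROUP BY cannot be confused with the raw column.

```python
import sqlite3

# Hypothetical table and rows, only for illustration (not the fiddle's data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO t (name, value) VALUES (?, ?)",
    [("a", 1), ("b", 2), ("c", 3), ("d", 4)],
)

# Pattern from the article: the output alias reuses the input column name.
same_alias = conn.execute(
    """
    SELECT CASE WHEN name < 'c' THEN 'combined'
                ELSE name
           END AS name
    FROM t
    ORDER BY rowid
    """
).fetchall()

# A distinct alias (name_group, hypothetical) leaves no doubt about what
# GROUP BY and ORDER BY refer to.
distinct_alias = conn.execute(
    """
    SELECT CASE WHEN name < 'c' THEN 'combined'
                ELSE name
           END AS name_group,
           SUM(value) AS total
    FROM t
    GROUP BY name_group
    ORDER BY name_group
    """
).fetchall()
conn.close()

print(same_alias)
print(distinct_alias)
```

Giving the derived column its own name is one way to sidestep the question of whether a later clause sees the raw column or the CASE result.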
| name | value | |------|-------| | a…
I was working on this LeetCode question https://leetcode.com/contest/weekly-contest-212/problems/path-with-minimum-effort/ using backtracking and spent some time debugging strange output. This article discusses some pitfalls when using recursion with global variables, how to handle them, and how to change the code from global to local.
Disclaimer: The backtracking solution passes only 15/75 test cases and hits Time Limit Exceeded on the rest. The purpose of this article is to highlight possible issues with global variables rather than to give the best solution. For more elegant solutions using Binary Search, Dijkstra and Kruskal, watch Alex’s walkthrough (beginning 7:26) at https://www.youtube.com/watch?v=AM__3Zx1XNw&t=2325s&ab_channel=AlexWice
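One common pitfall of this kind can be shown on a much smaller problem than the LeetCode one. The sketch below is an assumption-laden illustration, not the article's solution: it enumerates subsets of a list, and the names subsets_global, subsets_local, shared_path and shared_results are hypothetical. The global version stores references to a single shared list, so every stored result changes as the recursion backtracks; the local version passes state through arguments and return values.

```python
shared_path = []      # global state, mutated in place by every call
shared_results = []   # global accumulator


def subsets_global(nums, start=0):
    # Pitfall: this stores a reference to the one shared list, not a snapshot.
    shared_results.append(shared_path)
    for i in range(start, len(nums)):
        shared_path.append(nums[i])
        subsets_global(nums, i + 1)
        shared_path.pop()  # undo the choice when backtracking


def subsets_local(nums, start=0, path=()):
    # State travels through arguments and return values instead of globals.
    results = [list(path)]
    for i in range(start, len(nums)):
        results.extend(subsets_local(nums, i + 1, path + (nums[i],)))
    return results


subsets_global([1, 2])
buggy = shared_results          # four references to the same, now empty, list
fixed = subsets_local([1, 2])

print(buggy)   # [[], [], [], []]
print(fixed)   # [[], [1], [1, 2], [2]]
```

Appending a copy (for example list(shared_path)) would also repair the global version, but the local version has the added benefit that repeated calls do not see leftovers from earlier runs.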
In a previous article https://medium.com/@hanqi_47643/scraping-excel-online-read-only-file-with-selenium-and-javascript-in-python-7bb549f05d66, I used Selenium to scrape this Excel Online file, but that felt a little indirect and slow, so here is a new attempt with new tools and knowledge gained. Full notebook at https://gist.github.com/gitgithan/b9f48e1b23e88f1fb1c56ad9b739adef
In the previous article, the strategy was to scroll, find, parse, scroll, find, parse,… Now, the goal is to send requests using the Python requests library to directly target the information we want.
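Before replaying any request, it can help to see which parameters the page URL itself carries. The sketch below is only a preliminary inspection step using the standard library, not the request-sending approach of this article; it decodes the query string of the Excel Online URL used here. The variable names are illustrative.

```python
from urllib.parse import parse_qs, urlsplit

# The Excel Online page URL from this article, split across lines for readability.
page_url = (
    "http://www.presupuesto.pr.gov/PRESUPUESTOPROPUESTO2020-2021/_layouts/15/WopiFrame.aspx"
    "?sourcedoc=%7B566feecf-1e0d-46b8-a505-7cd762665268%7D"
    "&action=edit"
    "&source=http%3A%2F%2Fwww%2Epresupuesto%2Epr%2Egov%2FPRESUPUESTOPROPUESTO2020%2D2021"
    "%2FFOMB%2520Budget%2520Requirements%2520FY%25202021%2FForms%2FAllItems%2Easpx"
    "%3FRootFolder%3D%252FPRESUPUESTOPROPUESTO2020%252D2021"
    "%252FFOMB%2520Budget%2520Requirements%2520FY%25202021"
)

parts = urlsplit(page_url)
params = parse_qs(parts.query)  # percent-decodes one level and groups values by key

for key, values in params.items():
    print(key, "=", values[0])
```

The sourcedoc value decodes to a GUID in braces, which is the kind of identifier worth looking for when reading through the recorded network requests.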
Begin by pressing F12 in Chrome to open Developer Tools → Network tab, then load http://www.presupuesto.pr.gov/PRESUPUESTOPROPUESTO2020-2021/_layouts/15/WopiFrame.aspx?sourcedoc=%7B566feecf-1e0d-46b8-a505-7cd762665268%7D&action=edit&source=http%3A%2F%2Fwww%2Epresupuesto%2Epr%2Egov%2FPRESUPUESTOPROPUESTO2020%2D2021%2FFOMB%2520Budget%2520Requirements%2520FY%25202021%2FForms%2FAllItems%2Easpx%3FRootFolder%3D%252FPRESUPUESTOPROPUESTO2020%252D2021%252FFOMB%2520Budget%2520Requirements%2520FY%25202021 (or press F5 to reload the page) to see a list of network requests being recorded; we want…
This exercise was prompted by a question on a forum https://community.dataquest.io/t/how-to-download-an-excel-online-file/494093 regarding how to download a read-only file http://www.presupuesto.pr.gov/PRESUPUESTOPROPUESTO2020-2021/_layouts/15/WopiFrame.aspx?sourcedoc=%7B566feecf-1e0d-46b8-a505-7cd762665268%7D&action=edit&source=http%3A%2F%2Fwww%2Epresupuesto%2Epr%2Egov%2FPRESUPUESTOPROPUESTO2020%2D2021%2FFOMB%2520Budget%2520Requirements%2520FY%25202021%2FForms%2FAllItems%2Easpx%3FRootFolder%3D%252FPRESUPUESTOPROPUESTO2020%252D2021%252FFOMB%2520Budget%2520Requirements%2520FY%25202021 from Excel Online that required authentication to download.
Copy-pasting a few cells works fine, but selecting everything with Ctrl+A and copy-pasting results in only the text “Retrieving data. Wait a few seconds and try to cut or copy again.” being pasted with no data, making data analysis of the full file difficult. The following sections will go through how to move around the document, get all the information, clean it, and put it together. Full notebook at https://gist.github.com/gitgithan/28f63f707bdbdd5dd9f51f553c6322dc…
Update on 15 Oct 2020: Congratulations! You are officially a Google Cloud Certified - Professional Machine Learning Engineer.
I tried a new set of 10 sample questions at https://cloud.google.com/certification/sample-questions/machine-learning-engineer
I’d say they are more difficult than 70% of the exam questions.
End of update.
1 Aug 2020, I checked to see that the registration page which a week ago showed “we have sufficient beta test takers and…
In my voluntary role providing online technical support for www.dataquest.io, I come across numerous questions that allow me to dive deeper into interesting topics I usually skim through.
Today, the question is:
What’s the difference between left_df.merge(right_df) and pd.merge(left_df, right_df)?
The short answer is that left_df.merge() calls pd.merge().
The former is used because it allows method chaining, analogous to the %>% pipe operator in R, which lets you write and read data processing code from left to right, such as left_df.merge(right_df).merge(right_df2). If you had to use pd.merge(), this is not the chaining style but the wrapping style, which ends up…
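The two styles can be put side by side in a small sketch. The three frames below and their column names (key, left_val, right_val, extra_val) are hypothetical and chosen only so that the default merge on the shared key column works; the point is that both spellings produce the same result while reading very differently.

```python
import pandas as pd

# Hypothetical frames sharing a "key" column, used only for illustration.
left_df = pd.DataFrame({"key": [1, 2, 3], "left_val": ["a", "b", "c"]})
right_df = pd.DataFrame({"key": [1, 2, 3], "right_val": ["x", "y", "z"]})
right_df2 = pd.DataFrame({"key": [1, 2, 3], "extra_val": [10, 20, 30]})

# Chaining style: reads left to right, one step after another.
chained = left_df.merge(right_df).merge(right_df2)

# Wrapping style: the first merge is nested inside the second, read inside out.
wrapped = pd.merge(pd.merge(left_df, right_df), right_df2)

print(chained.equals(wrapped))
```

With more than two or three steps, the nesting in the wrapping style grows inward, while the chained version simply gets longer.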
Full notebook at https://gist.github.com/gitgithan/0ba595e3ef9cf8fab7deeb7b8b533ba3
Alternatively, click “view raw” at the bottom right of this scrollable frame and save the json as an .ipynb file
In this article I will explore how dataframe.stack(), dataframe.melt() and dataframe.pivot_table() from the pandas data manipulation library for Python interact with each other in a transformation pipeline to reshape dataframes and recover the original dataframe, pointing out numerous caveats along the way as we follow the code.
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
import pandas as pd
By default, Jupyter notebooks only display the last line of every cell. The first two lines make Jupyter display…
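As a compact illustration of the reshaping round trip this article is about, here is a minimal sketch on a tiny frame. The frame and its column names (id, a, b) are hypothetical and not taken from the notebook. It reshapes wide to long with both melt and stack, then uses pivot_table to go back to wide.

```python
import pandas as pd

# Hypothetical wide frame, used only for illustration.
wide = pd.DataFrame({"id": ["x", "y"], "a": [1, 2], "b": [3, 4]})

# melt: wide -> long DataFrame with explicit variable/value columns.
long_melt = wide.melt(id_vars="id", var_name="variable", value_name="value")

# stack: wide -> long Series with a (id, column) MultiIndex.
long_stack = wide.set_index("id").stack()

# pivot_table: long -> wide again. It aggregates (mean by default), so the
# value dtype may differ from the original depending on the pandas version.
recovered = long_melt.pivot_table(
    index="id", columns="variable", values="value"
).reset_index()
recovered.columns.name = None  # drop the leftover "variable" label on the columns

print(long_melt)
print(long_stack)
print(recovered)
```

The leftover columns name and the possible dtype change are examples of the small details that need cleaning up before the recovered frame matches the original.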