Be careful with argument order and return values

Photo by Andrea Ferrario on Unsplash

I was working on this LeetCode question https://leetcode.com/contest/weekly-contest-212/problems/path-with-minimum-effort/ using backtracking and spent some time debugging strange output. This article discusses some pitfalls of using recursion with global variables, how to handle them, and how to change the code from global to local variables.

Disclaimer: The backtracking approach passes only 15 of 75 test cases and hits Time Limit Exceeded on the rest. The purpose of this article is to highlight possible issues with global variables rather than to give the best solution. For more elegant solutions using binary search, Dijkstra and Kruskal, watch Alex’s walkthrough (beginning at 7:26) at https://www.youtube.com/watch?v=AM__3Zx1XNw&t=2325s&ab_channel=AlexWice
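As a rough illustration of the kind of pitfall meant here, the sketch below runs a brute-force backtracking search for the minimum-effort path twice: once storing the answer in a global, once returning it. The function names (dfs_global, dfs_local) and the two small grids are made up for this example and are not taken from the article's code; it shows one common pitfall (stale global state between calls), which may not be the exact one discussed later.

```python
import math

best = math.inf  # global result, shared by every call to dfs_global


def dfs_global(heights, r, c, effort, seen):
    """Backtracking search that records its answer in the global `best`."""
    global best
    rows, cols = len(heights), len(heights[0])
    if (r, c) == (rows - 1, cols - 1):
        best = min(best, effort)
        return
    seen.add((r, c))
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
            step = abs(heights[nr][nc] - heights[r][c])
            dfs_global(heights, nr, nc, max(effort, step), seen)
    seen.remove((r, c))  # undo the choice before returning (backtrack)


def dfs_local(heights, r, c, effort, seen):
    """Same search, but the answer travels back through return values."""
    rows, cols = len(heights), len(heights[0])
    if (r, c) == (rows - 1, cols - 1):
        return effort
    seen.add((r, c))
    result = math.inf
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
            step = abs(heights[nr][nc] - heights[r][c])
            result = min(result, dfs_local(heights, nr, nc, max(effort, step), seen))
    seen.remove((r, c))
    return result


grid_a = [[1, 2, 2], [3, 8, 2], [5, 3, 5]]
grid_b = [[1, 10]]

dfs_global(grid_a, 0, 0, 0, set())
first = best                      # answer for grid_a
dfs_global(grid_b, 0, 0, 0, set())
stale = best                      # `best` was never reset, so grid_a's answer leaks in
local_b = dfs_local(grid_b, 0, 0, 0, set())
print(first, stale, local_b)
```

The global version needs `best` to be reset before every new search; the local version has no such hidden state, at the cost of having to pass the result back through every return.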

Problem


Hit the nail on the head

Copy as cURL (bash) to formulate a request

In a previous article https://medium.com/@hanqi_47643/scraping-excel-online-read-only-file-with-selenium-and-javascript-in-python-7bb549f05d66, I used Selenium to scrape this Excel Online file, but that felt a little indirect and slow, so here is a new attempt with new tools and knowledge gained. Full notebook at https://gist.github.com/gitgithan/b9f48e1b23e88f1fb1c56ad9b739adef

Creating the request

In the previous article, the strategy was to scroll, find, parse, scroll, find, parse, and so on. Now, the goal is to send requests with the Python requests library that directly target the information we want.

Begin by pressing F12 to open Developer Tools → Network tab in Chrome, then load http://www.presupuesto.pr.gov/PRESUPUESTOPROPUESTO2020-2021/_layouts/15/WopiFrame.aspx?sourcedoc=%7B566feecf-1e0d-46b8-a505-7cd762665268%7D&action=edit&source=http%3A%2F%2Fwww%2Epresupuesto%2Epr%2Egov%2FPRESUPUESTOPROPUESTO2020%2D2021%2FFOMB%2520Budget%2520Requirements%2520FY%25202021%2FForms%2FAllItems%2Easpx%3FRootFolder%3D%252FPRESUPUESTOPROPUESTO2020%252D2021%252FFOMB%2520Budget%2520Requirements%2520FY%25202021
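Once a request of interest has been spotted in the Network tab, it can be rebuilt in Python. The sketch below only prepares a request without sending it; the endpoint URL, the row and rowCount parameters, and the header value are placeholders invented for illustration, not the real values used by Excel Online, which would have to be copied from the browser.

```python
import requests

# Placeholder values: the real URL, query parameters and headers
# must be copied from the request shown in the Network tab.
url = "https://example.com/hypothetical/endpoint"
params = {"row": 0, "rowCount": 100}
headers = {"User-Agent": "Mozilla/5.0"}

# Build the request and inspect it before anything goes over the network.
request = requests.Request("GET", url, params=params, headers=headers)
prepared = request.prepare()
print(prepared.method, prepared.url)

# Sending it would look roughly like this (left commented out on purpose):
# with requests.Session() as session:
#     response = session.send(prepared, timeout=30)
#     response.raise_for_status()
```

Preparing the request first makes it easy to compare the final URL and headers against what the browser sent before any traffic is generated.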


Experience the joy of human-machine cooperation


This exercise was prompted by a question on a forum https://community.dataquest.io/t/how-to-download-an-excel-online-file/494093 about how to download a read-only file http://www.presupuesto.pr.gov/PRESUPUESTOPROPUESTO2020-2021/_layouts/15/WopiFrame.aspx?sourcedoc=%7B566feecf-1e0d-46b8-a505-7cd762665268%7D&action=edit&source=http%3A%2F%2Fwww%2Epresupuesto%2Epr%2Egov%2FPRESUPUESTOPROPUESTO2020%2D2021%2FFOMB%2520Budget%2520Requirements%2520FY%25202021%2FForms%2FAllItems%2Easpx%3FRootFolder%3D%252FPRESUPUESTOPROPUESTO2020%252D2021%252FFOMB%2520Budget%2520Requirements%2520FY%25202021 from Excel Online that requires authentication to download.

Copy-pasting a few cells works fine, but selecting everything with Ctrl+A and copy-pasting results in only the text “Retrieving data. Wait a few seconds and try to cut or copy again.” being pasted, with no data, which makes analysis of the full file difficult. The following sections go through how to move around the document, get all the information, clean it, and put it together. …


A journey of throwing oneself in the deep end

Photo by William Ferguson on Unsplash

— — — — — — — — Update on 15 Oct 2020 — — — — — — — — Congratulations! You are officially a Google Cloud Certified — Professional Machine Learning Engineer.

I tried a new set of 10 sample questions at https://cloud.google.com/certification/sample-questions/machine-learning-engineer
I’d say they are more difficult than 70% of the exam questions.
— — — — — ——— — — End of update — — — — —— — — — —

On 1 Aug 2020, I found that the registration page, which a week earlier had shown “we have sufficient beta test takers and registration is closed”, was surprisingly active again. I looked through the exam booking calendar and saw that the latest available date was 21 Aug 2020; scrolling all the way to Aug 2021 showed no further slots. …


A deep dive of the source to uncover import design patterns

Photo by Pascal Müller on Unsplash

In my voluntary role providing online technical support for www.dataquest.io, I come across numerous questions that allow me to dive deeper into interesting questions I usually skim through.

Today, the question is:
What’s the difference between
left_df.merge(right_df) vs pd.merge(left_df, right_df)?

The short answer is that left_df.merge() calls pd.merge().

The former is used because it allows method chaining, analogous to the %>% pipe operator in R, which lets you write and read data processing code from left to right, such as left_df.merge(right_df).merge(right_df2). With pd.merge(), you get a wrapping style rather than a chaining style, which ends up as an ugly pd.merge(pd.merge(left_df,right_df),right_df2).
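A small sketch can make the two styles concrete. The three DataFrames and their columns (key, a, b, c) are toy data invented for this example; both expressions rely on the default merge behavior of joining on the shared column.

```python
import pandas as pd

# Toy frames sharing a "key" column (illustrative data only).
left_df = pd.DataFrame({"key": [1, 2, 3], "a": ["x", "y", "z"]})
right_df = pd.DataFrame({"key": [1, 2, 3], "b": [10, 20, 30]})
right_df2 = pd.DataFrame({"key": [1, 2, 3], "c": [0.1, 0.2, 0.3]})

# Chaining style: reads left to right.
chained = left_df.merge(right_df).merge(right_df2)

# Wrapping style: reads inside out.
wrapped = pd.merge(pd.merge(left_df, right_df), right_df2)

print(chained.equals(wrapped))
```

Both forms produce the same frame here; the difference is purely in how the code reads.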


3 ways to shake up your data representation and recover

Photo by Outer Digit on Unsplash

Full notebook at https://gist.github.com/gitgithan/0ba595e3ef9cf8fab7deeb7b8b533ba3
Alternatively, click “view raw” at the bottom right of this scrollable frame and save the json as an .ipynb file

In this article I explore how dataframe.stack(), dataframe.melt() and dataframe.pivot_table() from pandas, the Python data manipulation library, interact with each other in a transformation pipeline that reshapes a dataframe and then recovers the original. The code also surfaces numerous caveats along the way.
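To give a feel for what "reshape and recover" means, here is a minimal round trip on a toy frame. The column names (id, x, y) and the choice of aggfunc="first" are assumptions made for this sketch; the notebook's own pipeline is longer and may be arranged differently.

```python
import pandas as pd

# Toy wide-format frame (illustrative data only).
df = pd.DataFrame({"id": ["a", "b"], "x": [1, 2], "y": [3, 4]})

# Wide -> long with melt, then long -> wide with pivot_table.
long_df = df.melt(id_vars="id", var_name="variable", value_name="value")
recovered = long_df.pivot_table(
    index="id", columns="variable", values="value", aggfunc="first"
).reset_index()
recovered.columns.name = None  # drop the leftover "variable" label on the columns

# Wide -> long with stack, then back again with unstack.
stacked = df.set_index("id").stack()
unstacked = stacked.unstack()

print(long_df)
print(recovered)
print(unstacked)
```

Small details such as the leftover column-index name after pivot_table are exactly the sort of caveat that needs handling before the recovered frame matches the original.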

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
import pandas as pd

By default, Jupyter notebooks display only the output of the last expression in each cell. The first two lines make Jupyter display the output of every expression, which avoids wrapping print() around every variable I wish to see. …

Han Qi
