These videos are lectures and demos from the Programming for Cultural Heritage course. The class focuses on building skills in the Python programming language. We often use resources from cultural heritage institutions (museums, libraries), but the course is not limited to that domain. Lecture videos cover a general area of interest and are cumulative, building on previous videos. The demo videos are shorter and cover a specific task using skills learned in the lectures.
Created by Matt Miller
This work is licensed under a Creative Commons Attribution 4.0 International License.
Short lecture on using the command line on Mac and PC, with a focus on using tools to parse and extract data from CSV files. The goal of this video is to get comfortable using the command line. We use data from the NYPL What’s on the Menu project and NYC Open Data.
How to get started installing Python on your computer, on both Mac and Windows. We also cover configuring Visual Studio Code as an option and show Python notebooks.
We look at the basics of Python and go over many facets of the language, such as variables, if statements, and for loops.
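As a small illustration of those basics, here is a minimal sketch that combines a variable, a for loop, and an if statement. The list of years is made-up sample data, not something taken from the video.

```python
# Hypothetical sample data: years attached to some collection items.
years = [1887, 1901, 1952, 1899]

# A variable holding an empty list that we fill as we loop.
before_1900 = []

for year in years:
    # The if statement keeps only the years earlier than 1900.
    if year < 1900:
        before_1900.append(year)

print(before_1900)
```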
We look at reading and writing files using the open function, and then reading CSV files using the csv module.
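A minimal sketch of reading CSV data with the csv module follows. The column names and rows are invented for illustration, and an in-memory io.StringIO stands in for a real file so the example runs on its own.

```python
import csv
import io

# Hypothetical sample data standing in for a real CSV file on disk.
sample = io.StringIO("name,price\nOysters,0.25\nCoffee,0.10\n")


def read_rows(handle):
    """Return every CSV row as a dict keyed by the header row."""
    return list(csv.DictReader(handle))


rows = read_rows(sample)
for row in rows:
    print(row["name"], row["price"])
```

With a real file, the same function could be called inside `with open("menu.csv", newline="", encoding="utf-8") as f:`, where `menu.csv` is a placeholder filename.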
We look at using the json module to load large JSON files and loop through them.
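A minimal sketch of loading JSON and looping through it is shown here. The records are invented, and the JSON is parsed from a string with json.loads so the example is self-contained; a real script would more likely call json.load on an open file.

```python
import json

# Hypothetical JSON text; a real file would be read with json.load(file_handle).
raw = '[{"title": "Map of Brooklyn", "year": 1890}, {"title": "Dinner menu", "year": 1901}]'

records = json.loads(raw)

# Loop through the parsed list of dicts and collect one field from each.
titles = []
for record in records:
    titles.append(record["title"])

print(titles)
```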
We write a CSV file in this video, specifically reducing a very large 2GB CSV file to a smaller one.
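One way to shrink a large CSV is to stream it row by row and write out only the rows and columns you need, so the whole file never has to fit in memory. The sketch below assumes that approach; the function name `reduce_csv`, the column names, and the filter are all hypothetical, and the video may do it differently.

```python
import csv
import io


def reduce_csv(src, dst, keep_columns, keep_row):
    """Stream rows from src to dst, keeping only some columns and rows."""
    reader = csv.DictReader(src)
    # extrasaction="ignore" drops any column not listed in keep_columns.
    writer = csv.DictWriter(dst, fieldnames=keep_columns, extrasaction="ignore")
    writer.writeheader()
    written = 0
    for row in reader:
        if keep_row(row):
            writer.writerow(row)
            written += 1
    return written


# Hypothetical sample data standing in for the large input file.
big = io.StringIO("id,borough,notes\n1,Queens,a\n2,Brooklyn,b\n3,Brooklyn,c\n")
small = io.StringIO()

count = reduce_csv(big, small, ["id", "borough"], lambda r: r["borough"] == "Brooklyn")
print(count)
print(small.getvalue())
```

With real files, `src` and `dst` would be handles from `open(..., newline="")` for reading and writing respectively.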
A quick look at using the json module to write out JSON files.
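A minimal sketch of writing JSON to a file and reading it back is given below. The record contents and the filename `objects.json` are made up, and a temporary directory is used so nothing is left behind on disk.

```python
import json
import os
import tempfile

# Hypothetical records to save.
records = [{"id": 1, "title": "Vase"}, {"id": 2, "title": "Portrait"}]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "objects.json")

    # indent=2 makes the file easier to read in a text editor.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2, ensure_ascii=False)

    # Read the file back to confirm it contains what we wrote.
    with open(path, encoding="utf-8") as f:
        round_trip = json.load(f)

print(round_trip)
```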
We go through what XML is and how to read and write it in Python, then do a challenge involving an ETL (extract, transform, load) script that transforms a CSV into an EAD XML finding aid.
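The CSV-to-XML transformation can be sketched with the standard library's xml.etree.ElementTree. This is a simplified illustration under stated assumptions: the CSV columns are invented, and the output is only a fragment using the EAD element names dsc, c01, did, unittitle, and unitdate, not a complete or validated finding aid.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical input data: one row per archival item.
csv_text = "title,date\nLetter to editor,1921\nPhotograph album,1934\n"


def rows_to_dsc(handle):
    """Build a <dsc> element with one <c01> component per CSV row."""
    dsc = ET.Element("dsc")
    for row in csv.DictReader(handle):
        component = ET.SubElement(dsc, "c01", level="item")
        did = ET.SubElement(component, "did")
        ET.SubElement(did, "unittitle").text = row["title"]
        ET.SubElement(did, "unitdate").text = row["date"]
    return dsc


root = rows_to_dsc(io.StringIO(csv_text))
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```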
To use external modules (libraries) in Python you need to install them. This video shows how to use pip to install them on Mac, Windows, Visual Studio Code, and Google Colab notebooks.
We dive into interacting with APIs via Python and the requests module. We look at working with the Smithsonian Museums API and try querying it.
An introduction to writing regular expressions in Python.
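As a small example of the kind of pattern matching regular expressions allow, the sketch below pulls four-digit years out of free text with the re module. The sample sentence and the year range (1500 to 2099) are assumptions chosen for illustration.

```python
import re

# Hypothetical catalog note containing dates and an unrelated number.
text = "Painted in 1887, restored 1952, catalog no. 10234."

# \b marks a word boundary, so digits inside a longer number are not matched.
year_pattern = re.compile(r"\b(?:1[5-9]\d{2}|20\d{2})\b")

found_years = year_pattern.findall(text)
print(found_years)
```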
We use Python to web scrape two sites, the Frick museum and the Milwaukee Art Museum. Lots of details on problems you can encounter while web scraping.
We look at using Git and GitHub on Mac and PC.
We look at using Python to interact with Wikidata. We run SPARQL queries from Python, both general queries and lookups by Q ID, then access the results and retrieve entity data from the Special:EntityData endpoint.
We use the Python module gpt-2-simple (https://github.com/minimaxir/gpt-2-si...) to generate text using the 124M model. We run into some problems because we need to run a different version of Python, so we install pyenv to run an older version of Python with an older version of TensorFlow.
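The two kinds of Wikidata request mentioned above can be sketched as follows. This version uses the standard library's urllib so it stays self-contained, which is a substitution and not necessarily what the video uses. The helper names, the sample query, and the User-Agent string are hypothetical. Only the URL-building helpers are called here; `fetch_json` needs network access.

```python
import json
import urllib.parse
import urllib.request

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def sparql_url(query):
    """Build a SPARQL query URL that asks for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return SPARQL_ENDPOINT + "?" + params


def entity_data_url(qid):
    """Build the Special:EntityData URL for one Q ID, in JSON form."""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def fetch_json(url):
    """Download a URL and parse the JSON body (requires network access)."""
    # The User-Agent value is a placeholder; use one that describes your script.
    request = urllib.request.Request(url, headers={"User-Agent": "example-course-script/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Sample query: five items that are instances of (P31) human (Q5).
query = "SELECT ?item WHERE { ?item wdt:P31 wd:Q5 } LIMIT 5"

print(sparql_url(query))
print(entity_data_url("Q42"))
```

Calling `fetch_json(sparql_url(query))` should return a dict whose rows sit under `results` and then `bindings`, which is the standard SPARQL JSON results layout.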
In this video we create 2(!) Twitter bots: a random taco bot and a Niles and Frasier dialog bot. We also build an AWS Lambda for them to run on. This is a long video, but there are many sections:
Intro - About the different types of Twitter bots and their requirements
08:00 - Start building the data for our TACO BOT
35:00 - Start building the Twitter post and AWS Lambda setup
1:04:45 - Start building the Niles and Frasier dialog bot using Beautiful Soup to do web scraping to build the data.
We use the Google Places API to look up information about restaurants in NYC in a specific geographical area.
We retrieve data from the Genius API using the LyricsGenius Python module.
We look at using Python to work with Google Sheets, both reading and writing data. We access the sheet as a JSON file and use the pygsheets module to update data in a sheet.
A quick look at MARC XML parsing with pymarc, using data from https://data.nls.uk/.
A brief intro to writing functions. We write a wrapper for the Brooklyn Museum API using our own functions.
We use the Harvard Art Museums API to download data locally and use glob to parse the records.
We also look at https://americanarchive.org to download XML PBCore files using web scraping and glob to manage the files.
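A minimal sketch of using glob to work through a folder of downloaded records is shown below. The filenames and record contents are invented, and the files are written to a temporary directory first so the example runs without any downloaded data.

```python
import glob
import json
import os
import tempfile


def load_records(folder):
    """Parse every .json file in folder and return the records as a list."""
    records = []
    # sorted() gives a stable order; glob alone does not guarantee one.
    for path in sorted(glob.glob(os.path.join(folder, "*.json"))):
        with open(path, encoding="utf-8") as f:
            records.append(json.load(f))
    return records


with tempfile.TemporaryDirectory() as folder:
    # Hypothetical files standing in for records saved from an API.
    for i, title in enumerate(["Vase", "Portrait"]):
        with open(os.path.join(folder, f"record_{i}.json"), "w", encoding="utf-8") as f:
            json.dump({"id": i, "title": title}, f)

    loaded = load_records(folder)

print([record["title"] for record in loaded])
```

The same pattern applies to XML files by changing the pattern to `*.xml` and swapping the JSON parsing for an XML parser.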
We register a domain and set up GitHub Pages hosting.