The Simple Act of Integrating a Weather API - Part 3

I revisited my solution, again. This time I got it. The cities.json file is now about 75KB instead of 19MB.

So where were we?! In my last post, I found that some city names were in the alternate names field, but bringing in all of the alternate names blew up my file size. I did some work to bring in only the countries that I care about, and only ASCII alternate names, which got it down to 19MB. This was still too big. I had talked about doing what I am about to document here.

I thought, I want to write something that just geocodes a list of locations that I give it. I can get this list from my database pretty easily. The query would be something like:

select distinct city, state, country_abbr from tournaments

This will give me what I want. I can save it as a csv and modify my existing code to create a file with just these locations geocoded, based on the data I have from the geonames database, and the same manually added and modified cities from the previous post.
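To make the idea concrete, here is a minimal Python sketch of filtering geonames rows down to the locations in the CSV. It is not the actual geocodecsv code. The function names (`load_wanted`, `geocode`, `sample_row`) and the sample rows are made up for illustration, and it assumes the state in the CSV matches the geonames admin1 code, which holds for US rows. The column positions are the ones used by the tab-separated geonames city dumps.

```python
import csv
import io

# Column positions in the tab-separated geonames city dump (e.g. cities500.txt).
NAME, ASCII_NAME, ALT_NAMES, LAT, LON, COUNTRY, ADMIN1 = 1, 2, 3, 4, 5, 8, 10


def load_wanted(csv_text):
    """Turn the exported query result into a set of lowercase lookup keys."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        tuple(row[col].strip().lower() for col in ("city", "state", "country_abbr"))
        for row in reader
    }


def geocode(cities_text, wanted):
    """Keep only geonames rows that match a wanted (city, state, country)."""
    found = {}
    for line in cities_text.splitlines():
        cols = line.split("\t")
        if len(cols) <= ADMIN1:
            continue  # skip blank or malformed lines
        state, country = cols[ADMIN1].lower(), cols[COUNTRY].lower()
        # Check the main name, the ASCII name, then every alternate name.
        for candidate in [cols[NAME], cols[ASCII_NAME]] + cols[ALT_NAMES].split(","):
            key = (candidate.strip().lower(), state, country)
            if key in wanted and key not in found:
                is_alternate = candidate not in (cols[NAME], cols[ASCII_NAME])
                found[key] = {
                    "name": cols[NAME],
                    "state": cols[ADMIN1],
                    "country": cols[COUNTRY],
                    "lat": float(cols[LAT]),
                    "lon": float(cols[LON]),
                    # Keep only the alternate name that was actually asked for.
                    "alternate_names": [candidate] if is_alternate else [],
                }
    return list(found.values())


def sample_row(geoname_id, name, alt_names, lat, lon, country, admin1):
    """Build the first 11 geonames columns for a row (illustrative data only)."""
    return "\t".join(
        [geoname_id, name, name, alt_names, lat, lon, "P", "PPL", country, "", admin1]
    )


# Hypothetical inputs standing in for the CSV export and the geonames dump.
LOCATIONS_CSV = "city,state,country_abbr\nAustin,TX,US\nSaint Louis,MO,US\n"
CITIES = "\n".join([
    sample_row("1", "Austin", "", "30.27", "-97.74", "US", "TX"),
    sample_row("2", "Dallas", "", "32.78", "-96.81", "US", "TX"),
    sample_row("3", "St. Louis", "Saint Louis,STL", "38.63", "-90.20", "US", "MO"),
])

results = geocode(CITIES, load_wanted(LOCATIONS_CSV))
for entry in results:
    print(entry)
```

In this sketch Dallas is dropped because nobody asked for it, and St. Louis is kept with a single alternate name, the one that appeared in the CSV.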

However, as my geonames application is pretty specific to what I set out to accomplish with it, this would need to be a new program. Instead of rewriting all of the parts that deal with the geonames data, I made those bits into a library within the geonames application, so I can include them in other programs. Code reuse is nice.

The solution I came up with is a program I called geocodecsv. Running that query in Azure Data Studio gives me all of the distinct locations (550 or so), and I'm able to save the result as a CSV file. This seemed fine.

It is a bear to call though! Thankfully, we have been able to write scripts on computers since about 1970. So, I just have a PowerShell script with the following:

./geocodecsv -cities cities500.txt -countries countryInfo.txt -locations unique_locations.csv -modcities modifycities.json -addcities manualcities.json -addcountries manualcountries.json -out cities.json

It takes a few flags: the cities and countries files from geonames.org, the CSV of unique locations I want to geocode, my original city modifications and additions, the added countries, and the output file. To get more info on those, you can view my past posts on this topic.

Part 1
Part 2

This then generates a file that has only the CITIES that I want. I still use the alternate names, but the field is only populated, again, with the alternate names that I want! It's quite amazing if I do say so myself :) It got my file from 19,933,999 bytes down to 75,475 bytes. This has the added benefit that I can now open it in text editors and look at it, where before it would crash my apps.
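For a sense of scale, the two byte counts above can be turned into a ratio and a percentage with a couple of lines of Python. This is just arithmetic on the numbers already quoted, nothing from the program itself.

```python
# File sizes in bytes, as quoted above.
before = 19_933_999
after = 75_475

ratio = before / after          # how many times smaller the new file is
reduction = 1 - after / before  # fraction of the original size removed

print(f"{ratio:.0f}x smaller, {reduction:.2%} reduction")
# prints "264x smaller, 99.62% reduction"
```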

The original geonames program is still good. It may not be used as an application anymore, but I will keep it as the central point for working with the geonames databases.

Happy coding!
