'computer science' posts - Page 7 - Jason Connell

Go Concurrency ala Rob Pike December 18, 2014

I watched Rob Pike's talk, "Concurrency is not Parallelism", so I wanted to take what he was saying with his gopher example, and make a program that tightly followed his model.

My example program, in re-writing this website in Go, is the bit of it that gets photos from my Flickr account that I have tagged with "jtccom" in order to make them into header images. This utilizes the Flickr JSON API which is pretty easy to use.

There are multiple steps to using the Flickr API, three separate web calls, which makes this ideal for Go concurrency style programming. First step is to get the photos from my account tagged with "jtccom". This returns an array with photo ID and photo Secret. In order to get the URL for the photo, you have to get the sizes first. This is a separate call to the Flickr API, which returns an array of, among other things, Label and Source. Source is the URL, Label is the size name. In this case I'm only interested in the Original size, which has the "Original" label. The next part is to download the content pointed to by Source in the "Original" Size.

So the idea was to have one goroutine that gets photos (step 1), another that gets sizes (step 2), and a third that downloads content (step 3). Conceptually, it looks like this:

getPhotos(photos chan Photo) // pumps photos into the photos channel

getSizes(photos chan Photo, sizes chan []Size) // pumps sizes for each photo into the sizes channel after calling the API for the photos in the photos channel

downloadPhoto(label string, size chan []Size, photoContent chan PhotoContent) // download files of size 'label' from the Size channel and pump it into the PhotoContent channel

Realistically, it works pretty much in order because the calls to getPhotos and getSizes are done way before it's done downloading the content, as each file is around 9-12 MB, but at least the getPhotos and getSizes can pretty much run in parallel.
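The shape of that pipeline can be tried out without Flickr at all. This is only a minimal sketch: the three stages are stubbed with made-up in-memory data (the example.com URLs and the fake "download" that just returns the URL bytes are assumptions for illustration, not the real API calls), and `runPipeline` is a hypothetical helper. Only the channel wiring follows the model described above.

```go
package main

import "fmt"

type Photo struct{ Id string }

type Size struct{ Label, Source string }

type PhotoWithSizes struct {
	Photo Photo
	Sizes []Size
}

type PhotoContent struct {
	Photo   Photo
	Content []byte
}

// getPhotos pumps one Photo per id into the photos channel.
func getPhotos(ids []string, photos chan<- Photo) {
	for _, id := range ids {
		photos <- Photo{Id: id}
	}
}

// getSizes attaches (made-up) sizes to every photo it receives.
func getSizes(photos <-chan Photo, sizes chan<- PhotoWithSizes) {
	for p := range photos {
		sizes <- PhotoWithSizes{Photo: p, Sizes: []Size{
			{Label: "Thumbnail", Source: "http://example.com/" + p.Id + "_t.jpg"},
			{Label: "Original", Source: "http://example.com/" + p.Id + "_o.jpg"},
		}}
	}
}

// downloadPhotos "downloads" the size with the given label. The URL bytes
// stand in for the real HTTP response body.
func downloadPhotos(label string, sizes <-chan PhotoWithSizes, content chan<- PhotoContent) {
	for ps := range sizes {
		for _, s := range ps.Sizes {
			if s.Label == label {
				content <- PhotoContent{Photo: ps.Photo, Content: []byte(s.Source)}
				break
			}
		}
	}
}

// runPipeline wires the stages together. Each stage runs in its own
// goroutine and closes its output channel when it is finished.
func runPipeline(ids []string) []PhotoContent {
	photos := make(chan Photo)
	sizes := make(chan PhotoWithSizes)
	content := make(chan PhotoContent)

	go func() { getPhotos(ids, photos); close(photos) }()
	go func() { getSizes(photos, sizes); close(sizes) }()
	go func() { downloadPhotos("Original", sizes, content); close(content) }()

	var results []PhotoContent
	for pc := range content {
		results = append(results, pc)
	}
	return results
}

func main() {
	for _, pc := range runPipeline([]string{"1", "2", "3"}) {
		fmt.Println("Downloaded", pc.Photo.Id, "of size", len(pc.Content))
	}
}
```

Because every channel is unbuffered and each stage is a single goroutine, results come out in the order the IDs went in.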

Code-wise, it looks very similar, just with goroutines, some object-style types, JSON parsing, etc.

For clarity I broke out Flickr specific calls into a separate file, but not a separate package. Here's the "flickrsvc.go" file, with some hidden things like API key obfuscated.

package main

import (
	"fmt"
	"sync"
	"time"
)

// saveFiles drains the content channel. For now it only reports what was
// downloaded; tmp and dest are not used yet.
func saveFiles(tmp, dest string, photoContent chan PhotoContent) {
	for pc := range photoContent {
		fmt.Println("Downloaded", pc.Photo.Id, "of size", len(pc.Content))
	}
}

func process() {
	var apiKey = "blahblah"
	var userId = "28240873@N07"
	var tag = "jtccom"

	var tmp = "../jtccom/content/tmp_download/"
	var destination = "../jtccom/static/images/backgrounds/"

	procWG := sync.WaitGroup{}

	photos := make(chan Photo)
	sizes := make(chan PhotoWithSizes)
	content := make(chan PhotoContent)

	procWG.Add(3)
	go func() {
		getPhotosByTag(tag, apiKey, userId, photos)
		close(photos)
		procWG.Done()
	}()

	go func() {
		getPhotoSizes(apiKey, photos, sizes)
		close(sizes)
		procWG.Done()
	}()

	go func() {
		downloadPhotos("Original", sizes, content)
		close(content)
		procWG.Done()
	}()

	saveFiles(tmp, destination, content)

	fmt.Println("wait procWG")
	procWG.Wait()
}

func main() {
	for {
		fmt.Println("going")
		process()

		fmt.Println("wait wg")

		fmt.Println("Sleeping")
		time.Sleep(3 * time.Second)
	}
}

And here is the output:

C:\Users\jconnell\Documents\go\src\jtccom.flickrsvc>jtccom.flickrsvc.exe
going
Downloaded 14685510038 of size 9867146
Downloaded 14465862480 of size 11279714
Downloaded 14649298391 of size 9423168
Downloaded 14076004795 of size 8925512
Downloaded 13936652032 of size 14851399
Downloaded 12076007194 of size 14099167
Downloaded 11678436824 of size 9671802
Downloaded 11507180674 of size 13510941
Downloaded 11507190024 of size 11963353
Downloaded 11412952753 of size 13030709

Here is flickr.go (although it doesn't matter what it's called).

package main

import (
	"encoding/json"
	"io/ioutil"
	"net/http"
	"net/url"
	"strings"
)

type Response struct {
	Wrap Photos `json:"photos"`
}

type Photos struct {
	Photo []Photo `json:"photo"`
}

type Photo struct {
	Id     string `json:"id"`
	Secret string `json:"secret"`
}

type SizeArray []Size

// GetSize returns the first size whose label matches (case-insensitively),
// or the zero Size if there is none.
func (sizeArray SizeArray) GetSize(label string) Size {
	var size Size
	for _, sz := range sizeArray {
		if strings.EqualFold(sz.Label, label) {
			size = sz
			break
		}
	}
	return size
}

type SizesResponse struct {
	Wrap Sizes `json:"sizes"`
}

type Sizes struct {
	Sizes SizeArray `json:"size"`
}

type Size struct {
	Label  string `json:"label"`
	Source string `json:"source"`
}

type PhotoWithSizes struct {
	Photo *Photo
	Sizes SizeArray
}

type PhotoContent struct {
	Photo   *Photo
	Content []byte
}

// getPhotosByTag pumps every photo tagged with tag into pchan.
func getPhotosByTag(tag, apiKey, userId string, pchan chan Photo) {
	qs := url.Values{}
	qs.Add("method", "flickr.photos.search")
	qs.Add("api_key", apiKey)
	qs.Add("user_id", userId)
	qs.Add("tags", tag)
	qs.Add("format", "json")
	qs.Add("nojsoncallback", "1")

	flickrUrl, _ := url.Parse("https://api.flickr.com/services/rest/?" + qs.Encode())

	if resp, err := http.Get(flickrUrl.String()); err == nil {
		defer resp.Body.Close()
		decoder := json.NewDecoder(resp.Body)

		photos := Response{}
		decoder.Decode(&photos)

		for _, p := range photos.Wrap.Photo {
			pchan <- p
		}
	} else {
		panic(err)
	}
}

// downloadPhotos fetches the content of the requested size for each photo.
func downloadPhotos(sizeLabel string, download chan PhotoWithSizes, downloaded chan PhotoContent) {
	for p := range download {
		url := p.Sizes.GetSize(sizeLabel).Source
		if resp, err := http.Get(url); err == nil {
			bytes, err := ioutil.ReadAll(resp.Body)
			resp.Body.Close()

			if err != nil {
				panic(err)
			} else {
				pc := PhotoContent{Photo: p.Photo, Content: bytes}
				downloaded <- pc
			}
		} else {
			panic(err)
		}
	}
}

// getPhotoSizes looks up the available sizes for each photo it receives.
func getPhotoSizes(apiKey string, photos chan Photo, photoSizes chan PhotoWithSizes) {
	for p := range photos {
		// Copy the loop variable: before Go 1.22, p is reused on every
		// iteration, so &p would make every PhotoWithSizes share one Photo.
		photo := p

		qs := url.Values{}
		qs.Add("method", "flickr.photos.getSizes")
		qs.Add("api_key", apiKey)
		qs.Add("photo_id", photo.Id)
		qs.Add("format", "json")
		qs.Add("nojsoncallback", "1")

		if sizesUrl, err := url.Parse("https://api.flickr.com/services/rest/?" + qs.Encode()); err == nil {
			if resp, err := http.Get(sizesUrl.String()); err == nil {
				decoder := json.NewDecoder(resp.Body)
				sizeResp := SizesResponse{}
				decoder.Decode(&sizeResp)
				resp.Body.Close()

				photoWithSizes := PhotoWithSizes{Photo: &photo, Sizes: sizeResp.Wrap.Sizes}
				photoSizes <- photoWithSizes
			} else {
				panic(err)
			}
		}
	}
}

I had some problems where the Flickr methods would return channels and they weren't working. And I had to experiment with buffered vs unbuffered channels, internal sync.WaitGroups, and stuff that wasn't working out so well. I will play around with this more, since apparently you can use WaitGroup without using Channels. I definitely want to play more to get a better understanding and find out why stuff I was trying initially wasn't working. But it's working now, I just have to finish it by saving it to the destination folder, and checking if the image was already downloaded. For future me, this would be good to do with a func that takes a channel and outputs to another channel all of the files that haven't yet been downloaded, to keep with the passing channels paradigm I've used so far.
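Note to future me: that "filter" stage could look something like the following. It is a rough sketch, not the finished thing. `filterNew`, `fileExists`, and `collectNew` are hypothetical names, `Photo` is a pared-down stand-in for the real type, and the "<id>.jpg" naming scheme is an assumption about how the files would be saved.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Photo is a pared-down stand-in for the Photo type used above.
type Photo struct{ Id string }

// filterNew forwards only the photos for which exists reports false.
// The caller closes out, matching the convention of the other stages.
func filterNew(in <-chan Photo, out chan<- Photo, exists func(Photo) bool) {
	for p := range in {
		if !exists(p) {
			out <- p
		}
	}
}

// fileExists returns a check that looks for "<dir>/<id>.jpg" on disk.
// The naming scheme is hypothetical.
func fileExists(dir string) func(Photo) bool {
	return func(p Photo) bool {
		_, err := os.Stat(filepath.Join(dir, p.Id+".jpg"))
		return err == nil
	}
}

// collectNew feeds a fixed list of IDs through filterNew and gathers
// the IDs that still need downloading.
func collectNew(ids []string, exists func(Photo) bool) []string {
	in := make(chan Photo)
	out := make(chan Photo)

	go func() {
		for _, id := range ids {
			in <- Photo{Id: id}
		}
		close(in)
	}()
	go func() {
		filterNew(in, out, exists)
		close(out)
	}()

	var fresh []string
	for p := range out {
		fresh = append(fresh, p.Id)
	}
	return fresh
}

func main() {
	exists := fileExists("../jtccom/static/images/backgrounds/")
	fmt.Println("still to download:", collectNew([]string{"111", "222"}, exists))
}
```

Passing the existence check in as a func keeps the stage itself free of file system details, so it slots between the getSizes and download steps the same way the other stages do.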

Hosting Multiple Go Websites Using Nginx December 17, 2014

So I've obviously been playing around with Go for a little while. I'd say now I've put in like 20-30 hours of good productive learning and coding in Go. I basically rewrote the blog display of this website in Go, connecting to MongoDB and stuff, using Go html/template, downloaded Gorilla Web Toolkit (AWESOME BTW), and tried to write very modular code that can be reused for other websites. However, there was that question of "other websites"? Each Go program compiles into its own program, calling the http.ListenAndServe() on the port specified, which when hosted on www.jasontconnell.com, would have to be 80.

I was playing with ideas in my head, like having a server.exe (obviously not .exe when I run it on Linux), which runs on port 80, and listens on another port for websites to register with it.

Server.exe starts up, jtccom.exe starts up, sends a message through RPC or some other network protocol that says "Yo, sup. If you would be so kind to send requests for jasontconnell.com to me, that'd be mighty generous of you." Server.exe would make note of the domain name and port that it's running on, and forward requests to it. This could also be done through a config file as well. But that would mean writing another full featured webserver in Go. I've already done one in Node.js, in Go it would be a bit easier since it's more fully featured as a webserver (including a template engine), and seems a bit faster. It wouldn't be as much work as doing it in Node because of the fact that templates are included (if you want to see some interesting code, ask me for my Node.js template engine code). But as John Carmack once said, "I don't think I have another engine in me".
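For the record, the forwarding part of that idea is small in Go. Here is a minimal sketch, assuming a fixed host-to-backend table instead of the RPC registration described above. `newHostRouter`, `newBackend`, and `fetch` are hypothetical names, and the backends are throwaway test servers standing in for the real sites. It uses `httputil.NewSingleHostReverseProxy` from the standard library.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strings"
)

// newHostRouter returns a handler that forwards each request to the backend
// registered for its Host header, or answers 502 if none is registered.
// Note that r.Host includes the port when the client sends one.
func newHostRouter(backends map[string]string) (http.Handler, error) {
	proxies := make(map[string]*httputil.ReverseProxy, len(backends))
	for host, target := range backends {
		u, err := url.Parse(target)
		if err != nil {
			return nil, err
		}
		proxies[host] = httputil.NewSingleHostReverseProxy(u)
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if p, ok := proxies[r.Host]; ok {
			p.ServeHTTP(w, r)
			return
		}
		http.Error(w, "no site registered for "+r.Host, http.StatusBadGateway)
	}), nil
}

// newBackend starts a throwaway site that just identifies itself.
func newBackend(name string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello from "+name)
	}))
}

// fetch sends a GET for "/" with the given Host through h and returns
// the status code and body.
func fetch(h http.Handler, host string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", "http://"+host+"/", nil))
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	jtccom, stringed := newBackend("jtccom"), newBackend("stringed")
	defer jtccom.Close()
	defer stringed.Close()

	router, err := newHostRouter(map[string]string{
		"jtccom":   jtccom.URL,
		"stringed": stringed.URL,
	})
	if err != nil {
		panic(err)
	}
	for _, host := range []string{"jtccom", "stringed", "unknown"} {
		code, body := fetch(router, host)
		fmt.Println(host, "->", code, body)
	}
}
```

In a real deployment the router would be handed to http.ListenAndServe on port 80 and the table would point at the ports the site binaries listen on. Everything else a real web server does (TLS, logging, static files) is left out here.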

Wanting to avoid writing another web server, I googled "host multiple Golang websites" (you have to add golang instead of go since go is such a generic term). I found this article, which is hosting Go websites with Nginx, and also covered a lot of other things I won't be doing. Using that article, I was able to download Nginx, set it up with some minimal configuration, and had two Go websites up and running successfully. I would have commented on that article to thank the author, but it required a login.

Here is the configuration in my nginx.conf file. These server blocks sit inside the http block of the config (I also like how the config file is structured).

server {
	listen 80;
	server_name jtccom;
	location / {
		proxy_set_header X-Real-IP $remote_addr;
		proxy_set_header X-Forwarded-For $remote_addr;
		proxy_set_header Host $host;
		proxy_pass http://jtccom:8080;
	}
}

server {
	listen 80;
	server_name stringed;
	location / {
		proxy_set_header X-Real-IP $remote_addr;
		proxy_set_header X-Forwarded-For $remote_addr;
		proxy_set_header Host $host;
		proxy_pass http://stringed:8081;
	}
}

Then I was able to hit http://jtccom and http://stringed with both of those programs running.

This has a lot of implications. With Node, I was running everything as root. Since I can put these on ports 44444 if I wanted to, I can run these as a non-root user, increasing the security in the process. My other server where this site was hosted was hit with a virus or something that crashed the site for a few days. My dev machine is on Windows, so it's not immediate and I wasn't getting the teen response time when just hitting the Go process directly, but that should speed up on Linux when the sites are ready to launch. Another implication is that I can continue down my path of fully committing to Go for future development, since the hosting issue is solved. I could still go down the path of writing my barebones server with proxying capabilities in Go if the Nginx server doesn't work out completely, speed-wise.

I'm intrigued by the possibilities in Go. Writing small code that is mind blowing, compiled, fast, runs on multiple different systems, has a huge corporate backing, and is just fun to write. My first foray into it a couple of months ago was just "Hmm... I have no idea what I'm looking at", which is my brain's way of saying "you should learn that". Fun stuff.

Go Recursive Diff to Verify SVN Merge December 16, 2014

For some reason, SVN Merge over large trees, where work was simultaneously going on in both branches, is very unreliable. And, you can't tell quickly if something is wrong, especially if what is different is not compiled. This could be very bad if you merge your current work into the trunk, and deploy the trunk version live. It would require a full regression test.

Fortunately, Go exists, and is fast. However, with all of my tweaking of goroutines and channels to try to get it to process each subdirectory in a separate goroutine, my efforts proved futile, as I couldn't get it to be quicker than 9 seconds. There's a lot of content. I wrote it to ignore whitespace changes, and that slowed it down immensely.

package utility

import (
	"crypto/md5"
	"fmt"
)

// MD5 returns the hex-encoded MD5 sum of content.
func MD5(content []byte) string {
	sum := md5.Sum(content)
	return fmt.Sprintf("%x", sum)
}

That's the utility package that I use to just hash large swaths of content. Then here's the huge chunk of code that is my recursive file diff, approximately 170 lines of code.

package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"regexp"
	"strings"
	"time"
	"utility"
)

type Dir struct {
	Name     string
	FullPath string
	BaseDir  *Dir
	Subdirs  []*Dir
	Files    []string
	Level    int
}

type FileResult struct {
	FullPath string
	Result   bool
}

// reg matches runs of non-word characters (whitespace and punctuation),
// which are stripped from file content before hashing.
var reg = regexp.MustCompile(`[\W]+`)

func readDirRecursive(base *Dir, types, ignore []string) {
	content, err := ioutil.ReadDir(base.FullPath)

	if err != nil {
		return
	}
	for _, f := range content {
		name := f.Name()
		if f.IsDir() {
			addDir := true
			for _, ign := range ignore {
				addDir = addDir && !strings.EqualFold(name, ign)
			}

			if addDir {
				sub := &Dir{Name: name, BaseDir: base, FullPath: base.FullPath + `\` + name, Level: base.Level + 1}
				readDirRecursive(sub, types, ignore)
				base.Subdirs = append(base.Subdirs, sub)
			}
		} else {
			addFile := false
			for _, t := range types {
				addFile = addFile || strings.HasSuffix(name, t)
			}
			if addFile {
				base.Files = append(base.Files, name)
			}
		}
	}
}

func spaces(times int) string {
	return strings.Repeat(" ", times)
}

func printDir(level int, dir *Dir) {
	fmt.Print(spaces(level) + dir.Name + "\n")
	for _, sd := range dir.Subdirs {
		printDir(level+1, sd)
	}

	for _, f := range dir.Files {
		fmt.Println(spaces(level) + "- " + f)
	}
}

func getContentMD5(file string) string {
	b, err := ioutil.ReadFile(file)
	if err != nil {
		fmt.Println(err)
		return "" // a string can't be nil; an unreadable file hashes to ""
	}

	s := reg.ReplaceAllString(string(b), "")
	return utility.MD5([]byte(s))
}

func compareFiles(file1, file2 string) bool {
	m1 := getContentMD5(file1)
	m2 := getContentMD5(file2)
	return m1 == m2
}

func compinternal(dir1 *Dir, dir2 *Dir, results chan FileResult) {
	for _, f := range dir1.Files {
		for _, f2 := range dir2.Files {
			if strings.EqualFold(f, f2) {
				result := compareFiles(dir1.FullPath+`\`+f, dir2.FullPath+`\`+f2)
				results <- FileResult{FullPath: dir1.FullPath + `\` + f, Result: result}
				break
			}
		}
	}

	for _, sd1 := range dir1.Subdirs {
		for _, sd2 := range dir2.Subdirs {
			if strings.EqualFold(sd1.Name, sd2.Name) {
				sdchan := make(chan FileResult)
				go func() {
					compinternal(sd1, sd2, sdchan)
					close(sdchan)
				}()

				// This loop drains sdchan before moving on to the next
				// subdirectory, so subdirectories are still handled one at a time.
				for sdresult := range sdchan {
					results <- sdresult
				}

				break
			}
		}
	}
}

func rcomp(dir1, dir2 string, filetypes []string, ignore []string) []string {
	diffs := []string{}

	left := &Dir{Name: "Root Left", FullPath: dir1, Level: 0, BaseDir: nil}
	right := &Dir{Name: "Root Right", FullPath: dir2, Level: 0, BaseDir: nil}

	readDirRecursive(left, filetypes, ignore)
	readDirRecursive(right, filetypes, ignore)

	resultChannel := make(chan FileResult)
	go func() {
		compinternal(left, right, resultChannel)
		close(resultChannel)
	}()

	for result := range resultChannel {
		if !result.Result {
			diffs = append(diffs, result.FullPath)
		}
	}

	return diffs
}

func main() {
	args := os.Args[1:]

	if len(args) < 4 {
		fmt.Println("need right and left directories, file types to include, folder names to ignore")
		return
	}

	start := time.Now().Unix()

	types := strings.Split(args[2], ";")
	ignore := strings.Split(args[3], ";")

	fmt.Println("Grabbing files " + strings.Join(types, ", "))
	fmt.Println("Ignoring folders " + strings.Join(ignore, ", "))

	diffs := rcomp(args[0], args[1], types, ignore)
	for _, diff := range diffs {
		fmt.Println(diff)
	}

	end := time.Now().Unix()

	total := end - start

	fmt.Println(total, "seconds taken")
}

[Image: Running rdiff]

I later added a counter to see how many files it's actually comparing. In the project I wrote this app for, the file count of those included (.js, .cs, etc) was 2,794.

[Image: Compared Count]

I could stand to clean up the output a bit, but it helped me identify a few files that were out of date with the branch. Thanks svn. And thanks Go!
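One thing I might try next is a fixed pool of workers pulling file pairs off a channel, instead of a goroutine per subdirectory. This is only a sketch under a few assumptions: file contents are already in memory (the real tool reads from disk), it strips whitespace with `\s+` rather than the `[\W]+` pattern used above, and `pair`, `normalizedMD5`, and `diffPairs` are made-up names. No promises that it beats 9 seconds.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"regexp"
	"sort"
	"sync"
)

var ws = regexp.MustCompile(`\s+`)

// normalizedMD5 hashes content with all whitespace removed.
func normalizedMD5(content []byte) string {
	return fmt.Sprintf("%x", md5.Sum(ws.ReplaceAll(content, nil)))
}

// pair is one left/right file to compare.
type pair struct {
	Name        string
	Left, Right []byte
}

// diffPairs compares the pairs on a fixed number of workers and returns
// the sorted names of those that differ.
func diffPairs(pairs []pair, workers int) []string {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan pair)
	diffs := make(chan string)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				if normalizedMD5(p.Left) != normalizedMD5(p.Right) {
					diffs <- p.Name
				}
			}
		}()
	}

	// Feed the jobs, then close diffs once every worker has finished.
	go func() {
		for _, p := range pairs {
			jobs <- p
		}
		close(jobs)
		wg.Wait()
		close(diffs)
	}()

	var out []string
	for name := range diffs {
		out = append(out, name)
	}
	sort.Strings(out) // workers finish in any order
	return out
}

func main() {
	pairs := []pair{
		{"a.js", []byte("var x = 1;"), []byte("var x=1;\n")},
		{"b.cs", []byte("int y = 2;"), []byte("int y = 3;")},
	}
	fmt.Println(diffPairs(pairs, 4)) // prints [b.cs]
}
```

The difference from the recursive version is that nothing waits on a single subdirectory: all workers stay busy until the jobs channel is empty.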

More Go MongoDB Testing Code December 12, 2014

This is generally moving towards how I would structure a final product that was written with MongoDB as the backend. It's big, so I'm going to play with new ways of putting code on my site. Here it is on github

Seriously Kindle Reader for PC December 12, 2014

This is an example of how Kindle for PC shows code snippets in the Go book I'm reading. It's seriously detrimental to my learning.

func (count *Count) Increment() { *count++ } ???

Luckily it works fine on my Nexus 9.

Go and MongoDB Initial Test December 11, 2014

This was so easy. Going against the database that runs this site:

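package main

import (
	"fmt"

	"gopkg.in/mgo.v2"
	"gopkg.in/mgo.v2/bson"
)

type Post struct {
	Body  string
	Title string `bson:"title"`
	Tags  []string
}

func main() {
	session, err := mgo.Dial("localhost")
	if err != nil {
		panic(err) // without this, the deferred Close would hit a nil session
	}
	defer session.Close()

	db := session.DB("jtccom")
	posts := db.C("posts")

	// Grab the first post in the collection and print its title.
	var first Post
	if err := posts.Find(bson.M{}).One(&first); err != nil {
		panic(err)
	}
	fmt.Println(first.Title)
}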

Here we go.

Go Programming Language and Kindle Unlimited December 10, 2014

Always learning. I look at Google's Go programming language, and at first it's new syntax and there are a few things foreign to me. I take this as a challenge that I want to overcome. I will know this language. Eventually. I have aspirations of doing everything I've done in Node.js in it. As Perl started it all, and Classic ASP replaced Perl, Java replaced Classic ASP, and Node.js replaced Java (in my "side language" progression, languages I've learned that haven't earned me a dime [other than in the aspect that I've grown and stretched my brain to think differently]), possibly eventually Go will replace Node.js. It seems more "grown up" to me. Back when I was in college I learned C++, and I haven't had a difficult-to-comprehend language to deal with since. Ideally this is because languages have been getting simpler: 4GL, more abstraction with regard to references and memory management, and multi-threading is cake (a cake with a shotgun hidden inside).

Go seems neat. There are things that come up, though, when you start learning a new language, and you forget about them later because they are initial growing pains, like setting up the environment to work best with the new language. I now have everything set up for Node development so that I don't have to think about it; I just get in there and start hacking. You immediately forget what it took to get to that point, so when you get to a new language you're like, man, this is much worse than Node was, is it worth it? The answer is that it's not worse, I just forgot about it and don't have to worry about it anymore.

Node was less of a leap, as well, since the language isn't new, there isn't any new syntax, and I've been doing Javascript for a decade... (although that doesn't necessarily hold true when learning Javascript APIs like Angular). However, Go is a bit of a leap. New syntax, new way of thinking, different environment setup. These are my favorite challenges. So I started looking on the web for how to do Go. In all honesty, I'm still lost :) I have my Go workspace set up with the bin, pkg and src directories, and the GOPATH environment variable to tell Go where it is. I just need to inject as much knowledge as I can when it comes to programming Go. Which brings me to Kindle Unlimited.

Kindle Unlimited. So much promise. Being bright eyed and wanting to learn Go, I wanted to see if any of the books out there were available on Amazon Kindle. To my delight, they were. And I had received emails from Amazon announcing Kindle Unlimited over the course of the past few weeks or whatever. So I wanted to buy the ticket and take the ride. They offered a free trial month also, which made this a zero pain investment. There are plenty of good Go books out there, and available on Kindle, that I could be reading for 6 months and still be getting more value out of Kindle Unlimited than what I was paying.

So I signed up. Then I searched for the book I was looking to read, "Programming in Go: Creating Applications for the 21st Century", which seemed like a good start. I didn't set out to create apps for the 20th or 19th centuries, and there's 85 years left in the 21st, this seemed on par as to where I wanted to go. Strangely there's no "send to my Kindle" button, since now I'm signed up for "Unlimited", meaning not having a limit, and I would like to read this book. So I go on the Kindle app on my phone, and notice there's a category for Kindle Unlimited, meaning books are categorized as Kindle Unlimited, certain books are picked to be in the Unlimited category, so only certain books (700,000) are available as unlimited, and the book I wanted to read is not available. So I looked at what Go books were available on Kindle Unlimited. There were some but I really need the beefy, 400+ pages that are offered on the book I mentioned earlier in this paragraph.

I cancelled my Kindle Unlimited subscription after 8 minutes.

Fix for Angular 1.3 $resource Not Stripping $ properties anymore October 2, 2014

Version 1:

module.config(["$httpProvider", function($httpProvider){
	$httpProvider.defaults.transformRequest.unshift(function(data){
		if (typeof data == "object"){
			var remove = [];
			for (var i in data) {
				if (i.indexOf("$") == 0 && i.indexOf("$$") == -1 && typeof(data[i]) != "function")
					remove.push(i);
			}
			remove.forEach(function(k){ delete data[k]; });
		}
		return data;
	});
}])

Version 2:

module.config(["$httpProvider", function($httpProvider){
	$httpProvider.defaults.transformRequest.unshift(function(data){
		if (typeof data == "object"){
			var copy = angular.copy(data);
			var remove = [];
			for (var i in copy) {
				if (i.indexOf("$") == 0) remove.push(i);
			}
			remove.forEach(function(k){ delete copy[k]; });
			return copy;
		}
		return data;
	});
}])

Let's examine. There's a bit going on here.

$resource registers its own transformRequest functions before ours is defined, but I want mine to load first.

$httpProvider.defaults.transformRequest.unshift

Unshift puts my method at the beginning. The data at this point is an object, whereas if I were to push it to the end, after the $resource transformRequest function, I get a string. It seems a bit more efficient to work with the object first, than to allow $resource to JSON.stringify the data, then for me to load it up through JSON.parse after.

The main difference between Version 1 and Version 2 is that Version 2 copies the object first, so I can delete every property that begins with $ from the copy. In Version 1 I skipped functions and the properties that Angular needed, but the reference objects were deleted from my actual object, and the list view got messed up because it's looking for those properties. So Version 2 is correct.

I add this to my global module, the one that gets bootstrapped to <html>, so that it only has to be registered once, like so:

;(function(angular){
	var module = angular.module("global.app", [...]);
	module.config(["$httpProvider", function($httpProvider){
		$httpProvider.defaults.transformRequest.unshift(function(data){
			var copy = angular.copy(data);
			if (typeof copy == "object"){
				var remove = [];
				for (var i in copy) {
					if (i.indexOf("$") == 0) remove.push(i);
				}
				remove.forEach(function(k){ delete copy[k]; });
			}
			return copy;
		});
	}])
})(angular);

And then bootstrap it

;(function(angular, document){
	angular.element(document).ready(function(){
		angular.bootstrap(angular.element("html"), ["global.app"]);
	})
})(angular, document);

Simple

No more automatic $promise unwrapping October 2, 2014

Like any Philadelphia sports fan says, "BOOOOO!!"

In AngularJS, I liked the cleanliness of the code:

$scope.objects = ObjectAdmin.query();

And as it automatically unwraps the promise, the data updates correctly on the view side, and everything works fine. But I started playing with Angular 1.3 and it no longer does this. Now you have to call:

ObjectAdmin.query().$promise.then(function(data){ $scope.objects = data; })

Which is just ugly. I guess the team at Angular made the decision for a good reason, but now I have to update all of my code. Booooo!!!

This became painfully obvious when attempting to group a multiple select options by a property that is bound as a foreign key, after the promise has resolved.

You can imagine this data structure:

object: { name: "Test", objectClass: "a_foreign_key" }

objectClass: { key: "a_foreign_key", name: "Object Type" }

Probably easier to think about in a non abstract way (although, in my current project, "Object" is the actual name of the item I'm dealing with).

person: { id: 1, name: "Jason", professionId: 1 }

profession: { id: 1, name: "Software Developer" }

So, when I get the list of "persons", and the list of "profession", I would grab the reference to the profession referenced by the professionId in the person, then assign it to a new property on each person, $professionRef.

So person 1 would look like this: { id: 1, name: "Jason", professionId: 1, $professionRef: { id: 1, name: "Software Developer" } }

THEN my select list would use the ng-options to this effect:

ng-options="person.name group by person.$professionRef.name for person in persons track by person.id" ng-model="office.bestSmellingDeveloper"

(The use of $professionRef and $resource no longer automatically removing $ properties on POST/PUT is another pain point for me :)

So at the time the $promise resolves, the select list is like F#@% YEAH DATA!!! And it binds itself, but $professionRef isn't updated yet. So it's not grouping by anything and you have a bland list of things. I use the chosen jquery plugin which just makes these look beautiful with bootstrap of course, and I get bummed when it looks fugly.

I realize this wouldn't have worked as is, even with $promise unwrapping, it all depends on when the SELECT chooses to bind its data, and it would typically be immediately after the promise resolves. Really where this affected the code the most was in the chosen plugin I wrote, which looks like this now that I'm not binding promises to it anymore.

module.directive("ngChosen", function($parse, $timeout){
	return {
		restrict: "A",
		require: "ngModel",
		link: function(scope, element, attrs, ngModel){

			scope.$watch(attrs["ngChosen"], function(){
				$timeout(function(){ element.trigger("chosen:updated"); });
			});

			scope.$watch(attrs["ngModel"], function(){
				$timeout(function(){ element.trigger("chosen:updated"); })
			});
			element.chosen();
		}
	}
});

But used to look like this:

module.directive("ngChosen", function($parse, $timeout){
	return {
		restrict: "A",
		require: "ngModel",
		link: function(scope, element, attrs, ngModel){
			var chosen = scope.$eval(attrs["ngChosen"]);

			if (chosen.$promise) {
				chosen.$promise.then(function(){
					$timeout(function(){ element.trigger("chosen:updated"); });
				})
			} else {
				scope.$watch(attrs["ngChosen"], function(){
					$timeout(function(){ element.trigger("chosen:updated"); });
				});
			}

			scope.$watch(attrs["ngModel"], function(){
				$timeout(function(){ element.trigger("chosen:updated"); })
			});
			element.chosen();
		}
	}
});

But now that Angular is no longer automatically unwrapping promises... well, I guess I could keep the promise unwrapping in the chosen plugin, just in case I have a simple case that I need to bind to it (hardly ever the case), but since I won't be binding $promise objects to select lists most of the time, I can just say I won't bind any even though it's easy. Because I'll be used to unwrapping them manually to handle cases like the aforementioned.

Enjoy!

Meta-Bits July 30, 2014

C# calls them attributes
java calls them annotations
i'll call mine meta-bits
MBOP
classic Hanson

You know you're a code snob when April 9, 2014

Today I wrote a method that takes an updated object from the server and merges it with the current user's objects. These objects are in a hierarchy of client -> environment -> credentials. So I wrote a function that sees that a credential has been changed (deleted, inserted, updated) and finds the client and environment that it belongs to. If it isn't there yet, it "pushes" it; otherwise, if it's not a delete, it updates it, and if it's a delete, it removes it. The other users can be editing clients or environments as well, so these have to work the same way.

The commit message for this change is as follows:

Update other users' data with data updated from the server. Need to clean up the compare function, in client controllers line 65 to 118

Yep, 53 lines of non-trivial code and I'm all like "That is WAY too long..." I will write a generic utility function that will do it for me. I found angular.copy to be very useful, I just need something to tell me what's changed.

I will need that function to keep track of users logging in and out as well, as it updates almost immediately for other users when someone logs out, and it uses way more code than I feel should be necessary as well.... If javascript had a universal "findIndex" method, it would be helpful, but I want to make it a 2-3 liner for these updates without writing my own "findIndex" function. More specialized...

Tag List Added October 8, 2013

I recently went about aggregating the tags used on my posts to create a sort of tag cloud. I never liked the display of tag clouds, so I just list them out in order of occurrence, with the most frequent showing first.

This should help me get some traffic. Node.js and MongoDB are super fast. It doesn't even stutter when loading the site, across 500+ posts. Actually, I have no idea how many there are. Super fast.

Here's the code which pretty much finishes in -5 seconds

var db = require("../db");

this.tagCloud = [];

this.initialize = function(site, callback){
	var self = this;
	while (self.tagCloud.pop());

	db.getPosts(site.db, {}, -1, -1, function(posts){
		var tags = {};
		posts.forEach(function(post){
			if (post == null) return;
			for (var i = 0; i < post.tags.length; i++){
				if (tags[post.tags[i]] == null)
					tags[post.tags[i]] = { tag: post.tags[i], count: 0 };

				tags[post.tags[i]].count++;
			}
		});

		for(var tag in tags){
			if (tags[tag].count > 8) // arbitrary limit so we don't list like 200 tags with 1 post each
				self.tagCloud.push(tags[tag]);
		}

		self.tagCloud.sort(function(a,b){ return b.count - a.count; });
		callback();
	});
}
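For comparison, here is roughly what the same count-filter-sort step might look like in Go. It is a sketch only: `Post`, `TagCount`, and `tagCloud` are made-up names, the posts are passed in as a slice rather than loaded from MongoDB, and the alphabetical tie-break is an addition so the output is deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

type Post struct{ Tags []string }

type TagCount struct {
	Tag   string
	Count int
}

// tagCloud counts how often each tag is used and returns the tags used
// more than minCount times, most frequent first.
func tagCloud(posts []Post, minCount int) []TagCount {
	counts := map[string]int{}
	for _, p := range posts {
		for _, t := range p.Tags {
			counts[t]++
		}
	}

	var cloud []TagCount
	for t, c := range counts {
		if c > minCount {
			cloud = append(cloud, TagCount{Tag: t, Count: c})
		}
	}

	sort.Slice(cloud, func(i, j int) bool {
		if cloud[i].Count != cloud[j].Count {
			return cloud[i].Count > cloud[j].Count
		}
		return cloud[i].Tag < cloud[j].Tag // tie-break so equal counts sort predictably
	})
	return cloud
}

func main() {
	posts := []Post{
		{Tags: []string{"go", "mongodb"}},
		{Tags: []string{"go"}},
		{Tags: []string{"node"}},
	}
	fmt.Println(tagCloud(posts, 0)) // prints [{go 2} {mongodb 1} {node 1}]
}
```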

This one is a C# post May 11, 2013

At work, I was working on cool stuff, but then my boss was like "I need this report and this report and this report. Thanks."

I'm not one to turn down such a politely worded and completely fictitious request. Reports are easy until the requests become stuff like "Include subtotal line for every Apple category and Orange category"

My data set was obviously not Apples and Oranges, but here's what I did to quickly and easily make subtotals for each of these

First, I made some C# Attributes, which are nice when you like to work in the meta.

public class MyReportItem {
  [SubtotalGroup(GroupName = "Fruit Type")]
  public string FruitType { get; set; }

  public string FruitName { get; set; }

  [SubtotalSum]
  public int Count { get; set; }

  [SubtotalAverage]
  public int SalesPerDay { get; set; }

  [SubtotalSummaryDesignator]
  public bool IsSubtotalLine { get; set; }

  [TotalDesignator]
  public bool IsTotalLine { get; set; }
}

Your SQL might look like this:

select FruitType, FruitName, StockQty, SalesPerDay from Fruits order by FruitType, FruitName

So your data looks like this

'Apple', 'Mcintosh', 12, 80
'Apple', 'Delicious Red', 22, 50
'Orange', 'Some Orange Name', 33, 90

The code I wrote allows that data to be quickly, easily, and automatically shown like this:

'Apple', 'Mcintosh', 12, 80
'Apple', 'Delicious Red', 22, 50
'Apple Subtotal', '', 34, 65
'Orange', 'Some Orange Name', 33, 90
'Orange Subtotal', '', 33, 90

Notice the "SalesPerDay" column has an average attribute on it, not a sum. Here's the meat of my code, after getting the attributes and the data all figured out.

public List<T> PopulateSubtotalItems()
{
	List<T> withSubs = new List<T>();
	if (this.list.Count == 0) return withSubs;

	// allow multiple group by with subtotals. e.g. group by Fruit Name and say fruit type, like "Citrus"
	// to subtotal Oranges and subtotal Limes and then subtotal Citrus
	List<GroupSub<T>> subs = new List<GroupSub<T>>();  

	foreach (string key in this.groupBy.Keys)
	{
		T sub = new T();
		GroupSub<T> groupSub = this.groupBy[key];
		groupSub.SubRecord = sub;
		// sets the properties which designate the group. So this subgroup might set FruitType to "Apple"
		this.SetGroup(groupSub, this.list[0]);
		// sets the bool property which the subtotal designator is on to true.
		this.SetSummary(groupSub);  

		subs.Add(groupSub);
	}

	// if there's a bool property with the "TotalDesignator" attribute, include total
	GroupSub<T> totals = null;  
	if (this.includeTotal)
	{
		T sub = new T();
		totals = new GroupSub<T>();
		totals.SubRecord = sub;
		totals.IsTotal = true;
		this.SetTotal(totals);  // sets the property which the TotalDesignator is on to true
	}

	subs = subs.OrderBy(grp => grp.Sequence).ToList();

	int grpCount = 0;

	for (int i = 0; i < this.list.Count; i++)
	{
		bool added = false, last = i == this.list.Count - 1;
		foreach (GroupSub<T> grp in subs)
		{
			bool same = SameGroup(grp, this.list[i]);
			if (!same)
			{
				this.Average(grp, grpCount); // set the average properties to the sum / grpCount

				withSubs.Add(grp.SubRecord); // add the subtotal record to the group

				grpCount = 0;  // start afresh
				grp.SubRecord = new T();
				this.SetSummary(grp);
				SetGroup(grp, this.list[i]);
			}

			Increment(grp, this.list[i]);

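			if (last) // special handling on the last one.
			{
				if (!added)
				{
					if (totals != null) Increment(totals, this.list[i]);
					withSubs.Add(this.list[i]);
					added = true;
					// count the final row before averaging, so a last group
					// with a single row divides by 1 instead of 0
					grpCount++;
				}
				this.Average(grp, grpCount);
				withSubs.Add(grp.SubRecord);
			}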
		}

		if (!added)
		{
			// totals is null when there is no TotalDesignator property
			if (totals != null) Increment(totals, this.list[i]);
			withSubs.Add(this.list[i]);
			added = true;
			grpCount++;
		}
	}

	// add the total line
	if (this.includeTotal)
	{
		this.Average(totals, this.list.Count); // average the total record
		withSubs.Add(totals.SubRecord);
	}

	return withSubs; // that's it!!
}
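
The same control-break idea can be sketched without attributes or reflection. Here's a rough Go version, assuming the rows are already sorted by FruitType (as the "order by" above does). The Row type and WithSubtotals function are made-up names for illustration, not part of the original code, and the average uses integer division.

```go
package main

import "fmt"

// Row is a hypothetical report row mirroring the MyReportItem fields.
type Row struct {
	FruitType   string
	FruitName   string
	Count       int
	SalesPerDay int
	IsSubtotal  bool
}

// WithSubtotals assumes rows are sorted by FruitType. After each group it
// appends a subtotal row that sums Count and averages SalesPerDay.
func WithSubtotals(rows []Row) []Row {
	var out []Row
	var sub Row
	salesSum, n := 0, 0

	// flush finishes the current group's subtotal row, if there is one.
	flush := func() {
		if n == 0 {
			return
		}
		sub.SalesPerDay = salesSum / n
		out = append(out, sub)
	}

	for _, r := range rows {
		if n > 0 && r.FruitType != sub.FruitType {
			flush()
			salesSum, n = 0, 0 // start afresh
		}
		if n == 0 {
			sub = Row{FruitType: r.FruitType, IsSubtotal: true}
		}
		sub.Count += r.Count
		salesSum += r.SalesPerDay
		n++
		out = append(out, r)
	}
	flush() // the last group has no following row to trigger it
	return out
}

func main() {
	rows := []Row{
		{"Apple", "Mcintosh", 12, 80, false},
		{"Apple", "Delicious Red", 22, 50, false},
		{"Orange", "Some Orange Name", 33, 90, false},
	}
	for _, r := range WithSubtotals(rows) {
		fmt.Printf("%+v\n", r)
	}
}
```

Keeping the row count per group (n here) avoids the off-by-one risk on the last row, because the count is bumped before the subtotal is ever flushed.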

As you can see, I no longer have to dread doing subtotals on reports!

Anti Code Generation March 19, 2013

I've been anti code generation ever since always. I have some good reasoning behind it. Recently, though, the company I work for has inherited code from another firm (always bad in my experience), and a lot of it was generated from an internal tool, and it made me think about why I have been against it, and made me more firm in my stance.

We actually use code generation for some projects, but we're smart / not wasteful about it.

Here's my selling point. If one becomes dependent on code generation, their data architecture can suffer grave consequences: it ends up inefficient and generally not thought out. They will create any old architecture because it costs them nothing in terms of time. If you've been doing code generation against data structures, you have no reason to design an efficient one, reuse concepts across the entire structure, or optimize for efficiency. You might also get stuck with certain data types, and with assumptions about past data that are baked into whatever your generator produces.

Some explanation. CRUD operations are generally easy to generate. You generate an insert stored procedure, an update (or combine them), a select and a delete. This is for one object in your database. Say you have a customer object with an address foreign key: now you have to get the Customer->AddressID and then make another call to get the Address data. But sometimes you don't want customer and address at the same time, so now you have two stored procedures. When I wrote an ORM, this was one thing I nipped in the bud. It would do the join and get all data for customer and address in one call if you passed "load references = true" to the load method.
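
To make the "load references" idea concrete, here's a minimal Go sketch of a query builder with that switch. It is not the ORM described above. BuildCustomerQuery, the table names, and the column names are all hypothetical.

```go
package main

import "fmt"

// BuildCustomerQuery returns the SQL used to load one customer.
// With loadReferences set, the address comes back in the same round trip
// via a join; otherwise only the customer row is fetched.
// Table and column names are assumptions for the example.
func BuildCustomerQuery(loadReferences bool) string {
	if !loadReferences {
		return "SELECT c.* FROM Customer c WHERE c.ID = ?"
	}
	return "SELECT c.*, a.* FROM Customer c " +
		"LEFT JOIN Address a ON a.ID = c.AddressID WHERE c.ID = ?"
}

func main() {
	fmt.Println(BuildCustomerQuery(false))
	fmt.Println(BuildCustomerQuery(true))
}
```

One load method with a flag covers both cases, so there's no need for a second generated stored procedure just to pick up the address.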

A bit around the not-thought-out part... I take great care in designing a data architecture, and I'm just a lowly software developer! I kid, of course, the data structure is at the crux of what I do. If I have a shitty data structure, I can't work with it. One recent example was an application of sorts, and a wizard with steps 1-4. It was inherited code and data structure which we couldn't change. The main application part with the customer data didn't have a date field for created or updated date time. Those were stored in a log table. So to get the time that the application was created, you had to do this:

Select app.* from app join app_log on app.id = app_log.app_id where app_log.date = (select min(date) from app_log where app_id = app.id) order by app_log.date desc

That inner select is a killer, since it runs against the log table for every app row. We'll hopefully get both createDate and updateDate fields on the app table in the near future. This actually wasn't a product of code generation but more of an example of the bad code that we inherit.
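
If those two columns ever do get added, the values could be backfilled from the log in a single pass instead of one subquery per app. Here's a rough Go sketch of that pass, assuming the log rows have already been read into memory. LogEntry, Stamps, StampsByApp and the day helper are names invented for the example.

```go
package main

import (
	"fmt"
	"time"
)

// LogEntry is a hypothetical row from app_log.
type LogEntry struct {
	AppID int
	Date  time.Time
}

// Stamps holds the earliest and latest log dates seen for one app.
type Stamps struct {
	Created time.Time
	Updated time.Time
}

// StampsByApp walks the log once and keeps the min and max date per app.
func StampsByApp(logs []LogEntry) map[int]Stamps {
	out := make(map[int]Stamps)
	for _, l := range logs {
		s, ok := out[l.AppID]
		if !ok {
			out[l.AppID] = Stamps{Created: l.Date, Updated: l.Date}
			continue
		}
		if l.Date.Before(s.Created) {
			s.Created = l.Date
		}
		if l.Date.After(s.Updated) {
			s.Updated = l.Date
		}
		out[l.AppID] = s
	}
	return out
}

// day returns a date n days after an arbitrary start, for the example only.
func day(n int) time.Time {
	return time.Date(2013, time.March, 1, 0, 0, 0, 0, time.UTC).AddDate(0, 0, n)
}

func main() {
	logs := []LogEntry{{1, day(3)}, {1, day(1)}, {2, day(5)}, {1, day(7)}}
	for id, s := range StampsByApp(logs) {
		fmt.Println(id, s.Created.Format("2006-01-02"), s.Updated.Format("2006-01-02"))
	}
}
```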

More of the not thought out stuff, since that last one wasn't to do with code gen... Sometimes there will be repeated fields or concepts. Some people get an Address table and if it's going to be a slightly different address, add another address table with all of the fields from address, and then other fields that they needed. Hey, it's simple to just generate the new code!! But the structure is repeated. I will try my damnedest to not repeat code or a data structure. If something has an int and a string, and another thing has an int and a string, they will both inherit from a base class that has an int and a string. Goddamnit!
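
Here's a small sketch of the "don't repeat the structure" point in Go. Go has no base classes, so this uses struct embedding as the closest equivalent. The Address and ShippingAddress types and their fields are hypothetical.

```go
package main

import "fmt"

// Address holds the fields every kind of address shares.
type Address struct {
	Street     string
	City       string
	PostalCode string
}

// Label formats any address the same way, wherever it is embedded.
func (a Address) Label() string {
	return a.Street + ", " + a.City + " " + a.PostalCode
}

// ShippingAddress reuses Address by embedding it and adds only what differs,
// instead of copying every address field into a second type.
type ShippingAddress struct {
	Address
	DeliveryNotes string
}

func main() {
	s := ShippingAddress{
		Address:       Address{Street: "1 Main St", City: "Springfield", PostalCode: "12345"},
		DeliveryNotes: "Leave at the back door",
	}
	// Fields and methods of the embedded Address are promoted.
	fmt.Println(s.Label(), "/", s.DeliveryNotes)
}
```

The same shape works in a database: one address table, plus a small table for the extra fields that references it.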

Copying and pasting code is worse, but we're not talking about that.

Moving on to the "you might get stuck with certain data types and assumptions" part of my thesis. By assumptions, I mean assumptions that were made when the code gen tool was written. For instance, the ID field must be an int. -1 means a null int. A lookup type table (basically an enumeration) must have a display value and an internal value (usually matching up with a code enumeration) and must have int ids. Our old code generator produced stored procedures in which, if a parameter was a bit field, 0 meant "don't care" and 1 meant "where this field = 1". You couldn't filter on that value being 0. If you wanted 0 to be the more significant value, you had to name the column in a negative way. Take a "Deleted" field, where 0 means not deleted: you couldn't get just the not-deleted records, so you would have to name the column "NotDeleted", which is crap.
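
The bit-field problem comes from squeezing three states (don't care, must be 1, must be 0) into two values. Here's a minimal Go sketch of a filter that keeps them separate by using a nil pointer for "don't care". DeletedFilter and the Deleted column are illustrative names, not the generator's real output.

```go
package main

import "fmt"

// DeletedFilter builds a WHERE fragment for a hypothetical Deleted bit column.
// nil means "don't care", so both true and false stay available as filters.
func DeletedFilter(deleted *bool) string {
	switch {
	case deleted == nil:
		return ""
	case *deleted:
		return "WHERE Deleted = 1"
	default:
		return "WHERE Deleted = 0"
	}
}

func main() {
	no := false
	fmt.Printf("%q\n", DeletedFilter(nil)) // no filter at all
	fmt.Printf("%q\n", DeletedFilter(&no)) // only the not-deleted records
}
```

In a stored procedure the same idea would be a nullable bit parameter, with NULL as the "don't care" value.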

I was recently looking at my old code from college, it's great :) I remember my professors and how they molded me into the programmer I am today. Then all the many many hours I spent honing my skills. I wanted REUSABLE code. I never wrote a code generator for personal use. I have modified the one we use at work to be better and more acceptable, according to my standards. I'm tired and I'm going to bed...

Habits February 24, 2012

My variable/class/filename/etc. naming habits have changed recently, I'd say in the past year. Previously, I'd name CSS classes, HTML element ids, etc., in camel case. But I've moved towards hyphen delimited.

Old way: someObjectNameOrID

New way: some-object-name-or-id

This is for file names as well. It's just one of those transitions that takes place where I'm not sure how I feel about it, and I definitely haven't taken the time to think about the possible repercussions. I'm just going with the flow. If I could name C# classes that way, I probably would.
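
For renaming old identifiers in bulk, the conversion is mechanical. Here's a rough Go sketch. ToKebab is a made-up helper, and the rule for runs of capitals (treating "ID" as one word) is my assumption about the intended result.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// ToKebab converts a camelCase name to hyphen-delimited lowercase.
// A hyphen goes before an uppercase letter that follows a lowercase letter
// or digit, or that ends a run of capitals (the "P" in "HTMLParser").
func ToKebab(s string) string {
	runes := []rune(s)
	var b strings.Builder
	for i, r := range runes {
		if unicode.IsUpper(r) && i > 0 {
			prev := runes[i-1]
			nextLower := i+1 < len(runes) && unicode.IsLower(runes[i+1])
			if unicode.IsLower(prev) || unicode.IsDigit(prev) ||
				(unicode.IsUpper(prev) && nextLower) {
				b.WriteByte('-')
			}
		}
		b.WriteRune(unicode.ToLower(r))
	}
	return b.String()
}

func main() {
	fmt.Println(ToKebab("someObjectNameOrID")) // some-object-name-or-id
}
```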
