Writing an API in Golang

Since graduating from college, I’ve mostly written my side-projects in Python. The language has always enabled me to write a quick proof of concept and, if I really wanted to, make something ‘suitable for production’ with it. However, compared to writing Java and C++, I’ve always felt like I wasn’t getting everything that I could out of my code.

Therefore, instead of going through my outdated ‘Modern C++’ O’Reilly book, I decided to spend part of my holiday porting an existing project over to Golang. For learning, efficiency and especially… FUN!

Sentiment Analysis

The project I am going to port is part of my scraping pipeline. I post content to multiple social media websites and I am eager to get statistics on whether the content is receiving positive or negative feedback. Because more than one process uses this service, I had set up a Flask server that accepted a list of comments, each with an identifier, and returned the sentiment score for each comment, including metadata. The scoring was handled by the NLTK module.

The rewrite was to keep the same business functionality: receive a list of comments and return the sentiment score for each comment. Keeping the behaviour identical should ensure that nothing breaks, but there was no test suite to verify this. In hindsight, I regret not recording performance measurements or writing unit tests beforehand. As a result, there is no performance comparison, nor are there any statistics to show how the rewrite holds up against the old version.

But as it is a one-man show and the sentiment analysis earns me zero money (it probably costs me money per kWh instead!), I am ultimately not losing any sleep over it.

The data model

Our data model is straightforward. We send a JSON object with a “commentlist” field, which is an array of comment objects. Each comment consists of an identifier and the text.

{ 
    "commentlist":[
        {"id":"1", "text":"What a nice article you wrote!"}
    ] 
}

In return, the response is another array of objects with the comment’s identifier and the matching score. The requester already has the actual text, so our service does not send it back.

{ 
    "data":[
        {"id":"1", "score":"-0.5"}
    ] 
}
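
As a rough sketch, Go structs matching this JSON could look like the following. The field names mirror the ones used in the handler later on; the JSON tags and the helper functions decodeComments and encodeResults are assumptions for illustration, not the actual model package.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Comment is a single incoming comment.
type Comment struct {
	Id   string `json:"id"`
	Text string `json:"text"`
}

// CommentList is the request body: {"commentlist": [...]}.
type CommentList struct {
	Comments []Comment `json:"commentlist"`
}

// Result pairs a comment identifier with its score (sent as a string).
type Result struct {
	Id    string `json:"id"`
	Score string `json:"score"`
}

// ResultList is the response body: {"data": [...]}.
type ResultList struct {
	Result []Result `json:"data"`
}

// decodeComments parses a request body into a CommentList (hypothetical helper).
func decodeComments(body string) (CommentList, error) {
	var list CommentList
	err := json.NewDecoder(strings.NewReader(body)).Decode(&list)
	return list, err
}

// encodeResults renders a ResultList as a JSON string (hypothetical helper).
func encodeResults(list ResultList) (string, error) {
	out, err := json.Marshal(list)
	return string(out), err
}

func main() {
	list, err := decodeComments(`{"commentlist":[{"id":"1","text":"What a nice article you wrote!"}]}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(list.Comments[0].Id, list.Comments[0].Text)

	out, err := encodeResults(ResultList{Result: []Result{{Id: "1", Score: "-0.5"}}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Keeping the score as a string follows the response example above; a float64 field would work just as well if the clients prefer a JSON number.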

Happyflow

To ensure basic functionality, I wrote a happyflow to see if it worked. This was a simple bash script with curl calls containing made-up comments to see if the API performed its core function.

curl -X POST $HOST -H "Content-Type: application/json" -d '{"commentlist":
    [ { "id":"1", "text":"I like your article!" } ]
}'

However, I’m arguing with myself whether or not these are tests. This happyflow is so high level that I think it should not count as ‘tests’. Nothing targets the specific functions of the code that I wrote. Instead, it goes over the entire implementation. If I changed some part of my flow, it would be hard to find which part I broke during the rewrite. That’s what testing is for me: preventing regression at a certain granularity.

The webserver

Having Go start up an HTTP server with routes was surprisingly easy and very to the point. Using the built-in net/http library, it was straightforward to make it listen on the designated interface and port, with routes and methods configured.

Here, I am defining an http.Server to listen on localhost and port 8080.

mux := http.NewServeMux()

server := http.Server{
    Addr:    "localhost:8080",
    Handler: mux,
}

log.Fatal(server.ListenAndServe())

Next, writing the paths and methods it could handle was also a piece of cake. For this, the server uses a multiplexer, http.ServeMux, which the net/http documentation describes as the HTTP request multiplexer.

We can define the paths using the pattern ‘[method] [path]’ (supported since Go 1.22), followed by the handler function. The first one I will define is a keepalive health check endpoint, which can be used to see if the API is up and running.

mux.HandleFunc("GET /api/keepalive", handler.HealthCheck)

We want our clients to receive a “KEEPALIVE_OKAY” when they do a GET request to /api/keepalive. The handler function takes two arguments, which are passed along by the webserver. These are http.ResponseWriter, which handles the response, and *http.Request, which contains the HTTP request sent by the client.

func HealthCheck(w http.ResponseWriter, r *http.Request) {
	w.Write([]byte("KEEPALIVE_OKAY"))
}

As you can see, returning our text is very much to the point. Running my happyflow script shows that the call to this endpoint works perfectly, indicating that the server is running and is reachable.

A neat addition to this is that sending anything that is not a GET to this endpoint will result in a 405 Method Not Allowed, which is a standard HTTP status code. That is because we defined the keepalive endpoint to use the GET method.

Our implementation for the sentiment analysis will be a POST endpoint. This is because we do not want to transfer the list of comments in the URL as query parameters, where it could end up in log files. Sending it in the request body keeps the data out of those logs, and it will be encrypted in transit once we add TLS in the future.

Let’s define the route as such:

mux.HandleFunc("POST /api/is_positive", handler.AnalyseSentiment)

The function called here for this will be covered down below. But first, let’s look at…

Adding Middleware

A fun feature of Flask that I missed in this raw http.Server implementation was a log of my happyflow calls hitting their respective endpoints. Of course, I would get test results, but it’s useful to see the server application log this as well. To do this, I added middleware.

func LogRequests(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Printf("%s %s", r.Method, r.URL.Path)
		next.ServeHTTP(w, r)
	})
}

This nifty function is called every time one of my endpoints is reached. When implementing this yourself, do add input validation or else you’re opening yourself up to log injection.
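
A minimal sketch of such validation, assuming it is enough to strip control characters before logging: sanitizeForLog is a hypothetical helper, not part of the original middleware. Percent-encoded CR and LF in a URL are decoded into r.URL.Path, which is how a crafted request could otherwise start a fake log line.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
	"unicode"
)

// sanitizeForLog drops control characters (including CR and LF) so a
// crafted path cannot break out of its own log line.
func sanitizeForLog(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsControl(r) {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

// LogRequests logs the method and the sanitized path, then calls the next handler.
func LogRequests(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		log.Printf("%s %s", sanitizeForLog(r.Method), sanitizeForLog(r.URL.Path))
		next.ServeHTTP(w, r)
	})
}

func main() {
	handler := LogRequests(http.NotFoundHandler())

	// %0d%0a is decoded into CR LF in r.URL.Path before the middleware sees it.
	req := httptest.NewRequest(http.MethodGet, "/api/keepalive%0d%0aPOST%20/fake", nil)
	handler.ServeHTTP(httptest.NewRecorder(), req)
}
```

An alternative with less code is to log the path with the %q verb, which prints control characters as escape sequences instead of dropping them.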

To make our server call the middleware, I define a new variable ‘handler’ and pass along mux to our LogRequests function. Then, we switch the ‘mux’ variable from our http.Server definition to this new handler variable.

handler := middleware.LogRequests(mux)

server := http.Server{
    Addr:    viper.GetString("server.addr"),
    Handler: handler,
}

This causes our middleware to be called before our routes are called, even if this leads to a 404 on an undefined route.

The sentiment engine

Now, the meat and potatoes.

At the time of writing, I found two packages via ’the search engine’ that offered a sentiment-analysis implementation: one by cdipaolo and drankou/go-vader.

I’ve tried them both.

My first implementation was with cdipaolo, but I ran into two issues along the way.

  1. The first was that my happyflow script, with real comments from scraping, always returned a negative or bad score for positive comments. This was a bit frustrating, but I figured it would have been within my margin of error. The Flask implementation with NLTK also did not match 100% with what my human brain could tell was positive.

  2. The second issue was that I had a skill issue. When I refactored the code so the engine was initialized once as a ‘global variable’, it caused the engine to always return 0.0 on classification. This did not happen when I declared and initialized the engine in the endpoint itself, but don’t do that: creating a new sentiment engine for every call means allocating memory for it on every call (until it is garbage collected).

This was odd to me, and made me rewrite the sentiment analysis engine, which led me to try out drankou/go-vader. This implementation gave my happyflow tests a better score, but eventually the 0.0 skill issue bug returned. This was a bit frustrating, but after fooling around it became clear to me why this happened.

First, I defined sentiment.go in its own Go package in the ’engine’ directory. In there, I used the following code to define a pointer to the object handling the sentiment analysis.

var SentimentEngine *vader.SentimentIntensityAnalyzer

func CreateEngine(lexiconMap string, emojiLexicon string) error {
	SentimentEngine = &vader.SentimentIntensityAnalyzer{}
	return SentimentEngine.Init(lexiconMap, emojiLexicon)
}

This code was called in my main function, passing it the lexiconMap and the emojiLexiconMap.

err = engine.CreateEngine("data/lexiconmap.txt", "data/lexiconemojimap.txt")
if err != nil {
    panic(fmt.Errorf("fatal error starting engine: %w", err))
}

We can then call this engine object in our endpoint. We decode the incoming JSON containing the commentList and parse it into our struct. Each comment is then passed to the PolarityScores function. The result is placed into the resultList, which is encoded back to JSON and returned in the response.

func AnalyseSentiment(w http.ResponseWriter, r *http.Request) {

    var commentList model.CommentList
    var resultList []model.Result

    err := json.NewDecoder(r.Body).Decode(&commentList)
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }

    for _, comment := range commentList.Comments {
        result := engine.SentimentEngine.PolarityScores(comment.Text)
        resultList = append(resultList, model.Result{
            Id:    comment.Id,
            Score: fmt.Sprintf("%f", result["compound"]),
        })
    }

    response := model.ResultList{
        Result: resultList,
    }

    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(response)
}

Running my happyflow against this returned no issues, and it showed the 0.0 ‘bug’ to be solved as well. This made me quite happy.
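
One possible alternative to the package-level variable, sketched under assumptions, is to hand the engine to the handler when the route is registered. The scorer interface and the fixedScorer stub below are hypothetical; the interface is modelled on the PolarityScores call in the handler above. Whether the real engine is safe to share between concurrent requests is something to verify separately.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"strings"
)

// scorer is the one method the handler needs from a sentiment engine
// (hypothetical interface, modelled on the PolarityScores call above).
type scorer interface {
	PolarityScores(text string) map[string]float64
}

// fixedScorer is a stub engine that returns the same compound score for any text.
type fixedScorer float64

func (f fixedScorer) PolarityScores(string) map[string]float64 {
	return map[string]float64{"compound": float64(f)}
}

type comment struct {
	Id   string `json:"id"`
	Text string `json:"text"`
}

type result struct {
	Id    string `json:"id"`
	Score string `json:"score"`
}

// analyseSentiment returns a handler bound to one engine instance, so the
// engine is created once at startup instead of once per request.
func analyseSentiment(engine scorer) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		var in struct {
			Comments []comment `json:"commentlist"`
		}
		if err := json.NewDecoder(r.Body).Decode(&in); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}

		// A non-nil slice encodes as [] rather than null when there are no comments.
		results := make([]result, 0, len(in.Comments))
		for _, c := range in.Comments {
			score := engine.PolarityScores(c.Text)["compound"]
			results = append(results, result{Id: c.Id, Score: fmt.Sprintf("%f", score)})
		}

		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(map[string][]result{"data": results}); err != nil {
			log.Printf("encoding response: %v", err)
		}
	}
}

// post sends body to the handler in memory and returns status and response body.
func post(engine scorer, body string) (int, string) {
	req := httptest.NewRequest(http.MethodPost, "/api/is_positive", strings.NewReader(body))
	rec := httptest.NewRecorder()
	analyseSentiment(engine).ServeHTTP(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	status, body := post(fixedScorer(0.5), `{"commentlist":[{"id":"1","text":"I like your article!"}]}`)
	fmt.Println(status, body)
}
```

With this shape the route registration would read something like mux.HandleFunc("POST /api/is_positive", analyseSentiment(engine)), and the handler can be tested with a stub instead of the real lexicon files.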

I did wonder whether I should give cdipaolo another try with more comments. But feeling happy with this result, I decided to keep the current implementation with drankou. The reason for this is that I was lazy and eager to try out a different aspect of writing a Golang application, which was…

Parsing configuration

For parsing a configuration file, I used spf13/viper. Based on my ‘search engine’ results, this seemed the most mature and most used module for this sort of work. Implementing it was quick and easy. I was a bit worried about how it would discover the configuration directory and file path, but in the end it was not an issue.

Setting it up was as easy as using the examples, modified to my directory structure.

viper.SetConfigName("config")  
viper.SetConfigType("yaml")    
viper.AddConfigPath("config/") 
err := viper.ReadInConfig()   

And calling it was also very much to the point.

server := http.Server{
    Addr:    viper.GetString("server.addr"),
    Handler: handler,
}
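
For reference, a configuration file matching these calls could look like the fragment below. The file name follows from the SetConfigName, SetConfigType and AddConfigPath calls; the address value is an assumption taken from the hard-coded example earlier. Viper reads the dotted key server.addr as the addr field nested under server.

```yaml
# config/config.yaml (hypothetical contents)
server:
  addr: "localhost:8080"
```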

Conclusion so far

Rewriting it was a fun experience, but I think I will stick with Python when starting a new project. For now, I want to get more familiar with Go without also having to work out what the program is supposed to do.

By that, I mean that I will instead rewrite a couple more older projects. That way, I know -what- I want it to do, I just have to worry about writing it the Go way.

It felt difficult knowing what the ‘industry standard’ or ‘best practices’ are. Admittedly, I spent too much time worrying about writing ‘give me a job’-code, while I should instead write code that solves a problem (and write tests).

Contradicting that, I do have the itch to perfect a couple of things, such as writing tests, trying out ‘cdipaolo’ and showing metrics of how it is running in ‘production’. I will link a part 2 here once I have written the follow-up.
