Controlling the repository language on GitHub

Background

In a previous post I announced the release of version 1.0 of my Haskell application chefkoch on GitHub.

The only problem I had: GitHub thought it was an HTML project, or at least that was the language displayed in the repository overview.

Solution

GitHub uses the linguist library to determine the main language of a repository. I don’t know the internals in detail, but when more than one programming language is present in the repository, the one with the most code wins (as far as I know, linguist measures this in bytes rather than lines). Since I uploaded a few example HTML files, this is probably why my repository was classified as HTML.

The solution is a hidden file called .gitattributes. This is a standard Git file for attaching attributes to paths, and linguist reads a handful of its own attributes from it to override its defaults. One possibility is to mark paths as vendored, which excludes them when linguist determines the repo’s language.

Note: This is not the same as ignoring the files via .gitignore, because we definitely want to have these files in our repository!

Since I keep the HTML files in the directory resources, this one-liner was enough to solve my problem:

# .gitattributes
resources/* linguist-vendored
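
A few variations on the one-liner above, as a sketch rather than something I needed myself. As far as I understand the pattern rules, resources/* only covers files directly inside resources, so a nested layout would need the ** form. The docs directory is purely hypothetical and not part of my repository.

```gitattributes
# .gitattributes
# Files directly inside resources/ (the rule used above)
resources/* linguist-vendored

# Assumption: resources/ has subdirectories; ** also matches nested files
resources/** linguist-vendored

# Hypothetical docs/ directory: mark its files as documentation,
# which also keeps them out of the language statistics
docs/** linguist-documentation
```

Either attribute only changes how linguist counts the files; they remain tracked in the repository as before.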

Credits

I found this solution on Stack Overflow and in this blog post.