Where It All Went Wrong

Published on June 5, 2022 | Tags: django models datefield

Mistakes were made

So, I basically messed up my own website! (Fortunately, it was just that).

About a month before writing this post, I created a Django app to track how many users were visiting each page: something very simple to give me an idea of how many people were looking at my website. This is the version I had when the incident happened:

from django.db import models


class VisitPerPage(models.Model):
    page = models.CharField(max_length=100)
    date = models.DateField(auto_now=True)
    view_count = models.BigIntegerField(default=1)
    exists = models.BooleanField(default=True)

And then a middleware that basically checked the date and the page, and either added 1 to the visit count or created the row if it did not exist yet. So far, so good.
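To make the counting logic concrete, here is a minimal framework-free sketch. It is not the actual middleware: the `visits` dict is a hypothetical in-memory stand-in for the VisitPerPage table, and `record_visit` is a name made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical stand-in for the VisitPerPage table:
# maps (page, day) -> view_count.
visits = defaultdict(int)


def record_visit(page, day=None):
    """Add one visit for `page` on `day` (defaults to today)."""
    key = (page, day or date.today())
    visits[key] += 1  # a missing key starts at 0, so a first visit becomes 1
    return visits[key]
```

Each (page, day) pair gets its own counter, which mirrors the "add 1 or create" behaviour described above.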

However, because of Django's canonical trailing slash (which is what I prefer), some pages were recorded under two paths, for example /til and /til/. That was a problem for obvious reasons and, given that one redirects to the other, it probably meant that there was some double counting going on.

So my easy approach to solve this was to normalize the path, so that they were all consistent. The idea was to remove the trailing slash if it was there and do nothing otherwise. Simple, clean, worked fine.
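The post does not show the body of the normalising function, so the following is only a guess at what it might look like. The name `normalize_request_path` comes from the shell session in this post; keeping the site root as "/" is my own assumption.

```python
def normalize_request_path(path):
    """Strip trailing slashes so '/til/' and '/til' count as one page."""
    stripped = path.rstrip("/")
    # Assumption: the site root "/" would otherwise become an empty string,
    # so it is kept as "/".
    return stripped or "/"
```

With this, both `normalize_request_path("/til/")` and `normalize_request_path("/til")` give `"/til"`.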

And here is where it all went wrong.

In order to have all the data normalised in the database, I quickly opened the Django shell, which loads the Django configuration and sets everything up so you can use the Django querying system through the ORM. And wrote these few lines:

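from analytics.middleware import normalize_request_path
from analytics.models import VisitPerPage  # assumed module path for the model

data = VisitPerPage.objects.all()

for row in data:
    row.page = normalize_request_path(row.page)
    row.save()  # write the normalised path back to the database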

I pressed enter and, job well done! I felt very smug about my quick normalisation of the analytics, only to find out moments later that I had screwed up two whole weeks' worth of analytics. How? Because of this line:

date = models.DateField(auto_now=True)

This does not do what I thought it did. It saves the date every time the model is saved, not the date it was created. So now I had a bunch of rows in my database all with the same day, May 29th.
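The difference is easy to see in a toy imitation. The class below is plain Python, not Django code: `ToyRow`, `created` and `modified` are made-up names, and May 15th is just an example date for the original visit.

```python
from datetime import date


class ToyRow:
    """Plain-Python imitation of the two DateField options (not Django code)."""

    def __init__(self):
        self.created = None   # behaves like auto_now_add=True
        self.modified = None  # behaves like auto_now=True
        self._saved_before = False

    def save(self, today):
        if not self._saved_before:
            self.created = today  # set once, on the first save
            self._saved_before = True
        self.modified = today  # overwritten on every save


row = ToyRow()
row.save(date(2022, 5, 15))  # first save: the visit is recorded
row.save(date(2022, 5, 29))  # later save: e.g. a clean-up run from the shell
print(row.created, row.modified)  # 2022-05-15 2022-05-29
```

The field that imitates `auto_now` ends up with the date of the last save, which is exactly how every row collapsed onto the same day.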

Fortunately, there was not that much data, and scrapping it all was not that big of a deal, so I decided to scrap the database entirely and take this time to do some refactoring on the models and whatnot. If only I had used the correct kwarg for what I wanted it to do (but honestly, I never thought I would modify this data myself).

date = models.DateField(auto_now_add=True)


I guess now I will not forget which is the appropriate argument for the DateField any more or, at least, I will think twice before committing.

Ferran Jovell