newspaint

Documenting Problems That Were Difficult To Find The Answer To

Getting Seconds Past Epoch in GMT When All You Have Is mktime()

The Problem

I am using MinGW to create a C++ program. I want to take a string date and turn it into seconds past 1 Jan 1970 (standard Unix time) in GMT.

Unfortunately the mktime() function, when returning seconds past 1 Jan 1970 GMT, interprets the date in the tm structure as localtime.
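To see the problem concretely, here is a minimal sketch that asks mktime() for the same calendar fields under two different TZ settings. The helper name local_midnight_epoch() is made up for this illustration, and it assumes a POSIX system where setenv() and tzset() are available (on MinGW you would likely need a different way to set TZ).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* assumption: POSIX setenv()/tzset() exist */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * local_midnight_epoch() - hypothetical helper, not part of the solution
 *
 * tz - POSIX-style TZ string, e.g. "UTC0" or "EST5"
 * year - e.g. 2012
 * month - 1..12
 */
time_t local_midnight_epoch( const char *tz, int year, int month ) {
  struct tm tdata;

  assert( tz != NULL && month >= 1 && month <= 12 );

  /* switch the process timezone */
  setenv( "TZ", tz, 1 );
  tzset();

  memset( &tdata, 0, sizeof(tdata) );
  tdata.tm_year = year - 1900;
  tdata.tm_mon = month - 1;
  tdata.tm_mday = 1;

  /* mktime() reads the fields above as local time in the TZ zone */
  return( mktime( &tdata ) );
}
```

With TZ set to "UTC0" the call for January 2012 gives 1325376000. With "EST5" (five hours west of GMT, no daylight saving rule) the same fields give 1325394000, which is 18000 seconds later. Same tm contents, different answer.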

So how do I use the mktime() function to generate the seconds past 1 Jan 1970 GMT by interpreting the contents of the tm structure as GMT time? Well I’ve written a function to do this. I’ve written a few functions, actually, but let’s start with a basic function.

The Solution

Essentially, once we know the seconds past the epoch for the start of a month, it is trivial to calculate the seconds for any day, hour, minute, and second within that month. It is not necessary to concern ourselves with changes to daylight saving time because GMT does not move throughout the year (while localtime does for many timezones).

So we create a function to look up the seconds past 1 Jan 1970 for the GMT of the given year and month.

#include <time.h>
#include <string.h>

/*
 * getgmtmonthtime() - return GMT seconds past 1 Jan 1970
 *
 * year - e.g. 2012
 * month - 1..12 where January is 1 and December is 12
 */
time_t getgmtmonthtime( int year, int month ) {
  struct tm tcopy; /* mktime() modifies tm_hour field */
  struct tm tdata;

  memset( &tdata, 0, sizeof(tdata) );
  tdata.tm_year = year - 1900;
  tdata.tm_mon = month - 1;
  tdata.tm_mday = 1;
  memcpy( &tcopy, &tdata, sizeof(tdata) );

  /* interpret the fields as localtime, giving seconds past the epoch */
  time_t epoch = mktime( &tcopy );

  /* break that same instant down again, this time as GMT */
  struct tm *ptime = gmtime( &epoch );

  time_t offset = 0;
  if ( ptime->tm_mon == tdata.tm_mon ) {
    /* same month: localtime is level with or behind GMT */
    offset += ( ptime->tm_hour - tdata.tm_hour ) * 3600;
    offset += ( ptime->tm_min - tdata.tm_min ) * 60;
  } else {
    /* previous month: localtime is ahead of GMT, offset is negative */
    offset += ( ( ptime->tm_hour - 24 ) - tdata.tm_hour ) * 3600;
    offset += ( ptime->tm_min - tdata.tm_min ) * 60;
  }

  /* shift local midnight to GMT midnight */
  return( epoch - offset );
}

How does this work? We ask the operating system to interpret our GMT fields as localtime, which gives us the seconds past the epoch for local midnight on the first of the month. We then take the broken-down GMT representation of that instant and see how many hours and minutes it differs from midnight (remember that some places, such as South Australia, have an offset that includes half an hour, so we must take minutes into account as well). Subtracting that difference moves the result from local midnight to GMT midnight.
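The hour-and-minute arithmetic can be looked at in isolation. The sketch below pulls the two branches out into a pure function so the numbers are easy to follow. The name offset_from_fields() and its parameters are invented for this example and do not appear in the code above.

```c
#include <assert.h>

/*
 * offset_from_fields() - hypothetical helper mirroring the two branches
 *
 * want_mon - the month we asked for (0..11)
 * got_mon, got_hour, got_min - GMT fields of the local-midnight instant
 *
 * Returns the seconds to subtract from the mktime() result.
 */
long offset_from_fields( int want_mon, int got_mon, int got_hour, int got_min ) {
  assert( got_hour >= 0 && got_hour <= 23 );
  assert( got_min >= 0 && got_min <= 59 );

  if ( got_mon == want_mon ) {
    /* GMT clock already reads the 1st: local zone is behind GMT */
    return( got_hour * 3600L + got_min * 60L );
  }

  /* GMT clock still reads the previous month: local zone is ahead */
  return( ( got_hour - 24 ) * 3600L + got_min * 60L );
}
```

For a zone five hours behind GMT, local midnight on 1 January reads 05:00 on 1 January in GMT, so offset_from_fields( 0, 0, 5, 0 ) is 18000. For a zone thirteen hours ahead it reads 11:00 on 31 December, so offset_from_fields( 0, 11, 11, 0 ) is -46800. For a zone ten and a half hours ahead it reads 13:30 on 31 December, giving -37800, which is where the minutes matter.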

Note that this function is not thread-safe because gmtime() (like localtime()) returns a pointer to shared static storage. You could use the thread-safe gmtime_r() and localtime_r() functions where your operating system supports them but this may not be a portable solution.
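As a rough sketch of that thread-safe variant, the function below swaps gmtime() for gmtime_r() so the broken-down time lands in a caller-owned struct. It assumes a POSIX system; the name getgmtmonthtime_r() is made up here, and the (time_t)-1 error return is an addition that the original function does not have.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* assumption: POSIX gmtime_r() exists */
#endif
#include <assert.h>
#include <string.h>
#include <time.h>

/*
 * getgmtmonthtime_r() - hypothetical gmtime_r() based variant
 *
 * year - e.g. 2012
 * month - 1..12 where January is 1 and December is 12
 *
 * Returns GMT seconds past 1 Jan 1970, or (time_t)-1 on failure.
 */
time_t getgmtmonthtime_r( int year, int month ) {
  struct tm tcopy; /* mktime() may normalise these fields */
  struct tm gmt;   /* filled in by gmtime_r(), no shared buffer */
  time_t epoch;
  long offset;

  assert( month >= 1 && month <= 12 );

  memset( &tcopy, 0, sizeof(tcopy) );
  tcopy.tm_year = year - 1900;
  tcopy.tm_mon = month - 1;
  tcopy.tm_mday = 1;

  /* local midnight on the 1st, as seconds past the epoch */
  epoch = mktime( &tcopy );
  if ( epoch == (time_t)-1 || gmtime_r( &epoch, &gmt ) == NULL ) {
    return( (time_t)-1 );
  }

  /* how far past GMT midnight is that instant? */
  offset = gmt.tm_hour * 3600L + gmt.tm_min * 60L;
  if ( gmt.tm_mon != month - 1 ) {
    /* GMT still reads the previous month: local zone is ahead */
    offset -= 24 * 3600L;
  }

  return( epoch - offset );
}
```

The logic is the same as before, just folded into one offset calculation. Whether this is usable under MinGW depends on the runtime you link against, so treat it as an option for POSIX builds only.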

Now that we know when the month begins we can write a function to interpret the year, month, day, hour, minute, and second as GMT:

/*
 * getgmtime() - return GMT seconds past 1 Jan 1970
 *
 * year - e.g. 2012
 * month - 1..12 where January is 1 and December is 12
 * day - 1..28/29/30/31
 * hour - 0..23
 * minute - 0..59
 * second - 0..59
 */
time_t getgmtime(
  int year, int month, int day,
  int hour, int minute, int second
) {
  time_t monthseconds = getgmtmonthtime( year, month );

  monthseconds += (
    ( ( day - 1 ) * 24 * 3600 ) +
    ( hour * 3600 ) +
    ( minute * 60 ) +
    second
  );

  return( monthseconds );
}
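Since the starting point was a string date, here is one possible way to split a string into the six integers that a function like getgmtime() expects. The format "YYYY-MM-DD HH:MM:SS" and the name parsedatetime() are assumptions for this sketch; your input may look different.

```c
#include <assert.h>
#include <stdio.h>

/*
 * parsedatetime() - hypothetical parser for "YYYY-MM-DD HH:MM:SS"
 *
 * Returns 1 and fills in the six fields on success, 0 otherwise.
 * Only coarse range checks are done (it does not know month lengths).
 */
int parsedatetime(
  const char *text,
  int *year, int *month, int *day,
  int *hour, int *minute, int *second
) {
  assert( text != NULL );

  /* sscanf() returns the number of fields it managed to convert */
  if ( sscanf(
         text, "%d-%d-%d %d:%d:%d",
         year, month, day, hour, minute, second
       ) != 6 ) {
    return( 0 );
  }

  if ( *month < 1 || *month > 12 ) return( 0 );
  if ( *day < 1 || *day > 31 ) return( 0 );
  if ( *hour < 0 || *hour > 23 ) return( 0 );
  if ( *minute < 0 || *minute > 59 ) return( 0 );
  if ( *second < 0 || *second > 59 ) return( 0 );

  return( 1 );
}
```

On success the six values can be passed straight through as the year, month, day, hour, minute, and second arguments.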

Caching

What if you’re reading in a file of dates and there are often a series of dates with the same year and month in a row? How about caching the Operating System time lookup for speed?

You can approach caching in a lot of different ways. This is just a single suggestion. We can hold a very basic array of, say, 5 months’ values. Any time we have a cache hit we move that value to element zero (the first in the array). Whenever we miss the cache we copy the calculated year/month value to the last element in the list. With such a small list an array scan is probably the fastest lookup approach – much quicker than, say, a map, which we’d consider if we were caching a large number of values.

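So this is a rewrite of the getgmtime() function using a global cache of 5 values. The cached version is named getgmttime() (note the extra t) so it does not clash with the uncached one.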

struct cachemonthgmt {
  int yearandmonth;
  time_t epoch;
} cachemthgmarr[ 5 ];
int cachemthgmarrsz = ( sizeof(cachemthgmarr) / sizeof(cachemthgmarr[0]) );

time_t getgmttime(
  int year, int month, int day,
  int hour, int minute, int second
) {
  /* create a lookup key (*100 is arbitrary, could have used *12) */
  int yearandmonth = year * 100 + month;

  time_t monthseconds = 0;

  /* scan array for lookup key */
  int cacheidx;
  for ( cacheidx = 0; cacheidx < cachemthgmarrsz; cacheidx++ ) {
    if ( cachemthgmarr[cacheidx].yearandmonth == yearandmonth ) {
      /* cache hit! */
      monthseconds = cachemthgmarr[cacheidx].epoch;

      /* move into top position if not already */
      if ( cacheidx > 0 ) {
        struct cachemonthgmt temp = cachemthgmarr[cacheidx];
        memmove(
          &cachemthgmarr[1],
          &cachemthgmarr[0],
          sizeof(struct cachemonthgmt) * cacheidx
        );
        cachemthgmarr[0] = temp;
      }
      break;
    }
  }

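  if ( cacheidx >= cachemthgmarrsz ) {
    /* loop ran to the end without a hit (testing the index rather than
       monthseconds keeps Jan 1970, whose epoch is 0, cacheable) */
    /* cache miss - store lookup into lowest cache store */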
    monthseconds = getgmtmonthtime( year, month );
    cachemthgmarr[ cachemthgmarrsz - 1 ].yearandmonth = yearandmonth;
    cachemthgmarr[ cachemthgmarrsz - 1 ].epoch = monthseconds;
  }

  return(
    monthseconds +
    ( ( ((day-1) * 24 ) + hour ) * 3600 ) +
    ( minute * 60 ) +
    second
  );
}

Sure, it is a lot of extra code, but it will significantly improve the speed of date lookups. When I profiled the two functions in a piece of my code this cached version was more than twice as fast as the first function I presented. However even the slow version only took about 5% of the processing time of the entire utility (insignificant).

Testing

This was tested by writing a very simple loop as a main() function:

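#include <stdio.h>

int main( void ) {
    int month;
    /* month < 12 stops at November, as in the output below;
       use month <= 12 to include December */
    for ( month = 1; month < 12; month++ ) {
        /* time_t is not guaranteed to be an int, so cast for printf() */
        printf(
            "%04d-%02d-%02d %02d:%02d:%02d = %ld\n",
            2012, month, 1, 12, 0, 0,
            (long) getgmttime( 2012, month, 1, 12, 0, 0 )
        );
    }
    return( 0 );
}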

Then this was executed in different timezones by modifying the TZ variable. E.g.:

user@system:/tmp$ TZ=/usr/share/zoneinfo/Pacific/Auckland date
Tue Oct 30 03:53:32 NZDT 2012
user@system:/tmp$ TZ=/usr/share/zoneinfo/Pacific/Auckland ./test_months
2012-01-01 12:00:00 = 1325419200
2012-02-01 12:00:00 = 1328097600
2012-03-01 12:00:00 = 1330603200
2012-04-01 12:00:00 = 1333281600
2012-05-01 12:00:00 = 1335873600
2012-06-01 12:00:00 = 1338552000
2012-07-01 12:00:00 = 1341144000
2012-08-01 12:00:00 = 1343822400
2012-09-01 12:00:00 = 1346500800
2012-10-01 12:00:00 = 1349092800
2012-11-01 12:00:00 = 1351771200

user@system:/tmp$ TZ=/usr/share/zoneinfo/US/Eastern date
Mon Oct 29 10:54:41 EDT 2012
user@system:/tmp$ TZ=/usr/share/zoneinfo/US/Eastern ./test_months
2012-01-01 12:00:00 = 1325419200
2012-02-01 12:00:00 = 1328097600
2012-03-01 12:00:00 = 1330603200
2012-04-01 12:00:00 = 1333281600
2012-05-01 12:00:00 = 1335873600
2012-06-01 12:00:00 = 1338552000
2012-07-01 12:00:00 = 1341144000
2012-08-01 12:00:00 = 1343822400
2012-09-01 12:00:00 = 1346500800
2012-10-01 12:00:00 = 1349092800
2012-11-01 12:00:00 = 1351771200

The results are the same (which is what we’d expect – because the purpose of this function is to get the GMT time and should be unaffected by the local timezone). New Zealand was chosen as the east-of-GMT zone because, during summer, it is 13 hours ahead of GMT. A random USA timezone was chosen as west-of-GMT. Thus both paths of the function (which calculate localtime offset from GMT) were exercised.
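Two zones agreeing with each other does not by itself prove the numbers are right, so as an extra cross-check the expected values can be worked out by plain calendar arithmetic with no library time functions at all. This is only a sketch for verification, not part of the solution; the name expected_month_start() is made up and it only handles years from 1970 onwards.

```c
#include <assert.h>

/* Gregorian leap year rule */
static int is_leap_year( int year ) {
  return( ( year % 4 == 0 && year % 100 != 0 ) || year % 400 == 0 );
}

/*
 * expected_month_start() - hypothetical cross-check helper
 *
 * year - 1970 or later
 * month - 1..12
 *
 * Returns GMT seconds past 1 Jan 1970 for 00:00:00 on the 1st.
 */
long long expected_month_start( int year, int month ) {
  static const int days_in_month[12] = {
    31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31
  };
  long long days = 0;
  int y, m;

  assert( year >= 1970 && month >= 1 && month <= 12 );

  /* whole years since 1970 */
  for ( y = 1970; y < year; y++ ) {
    days += is_leap_year( y ) ? 366 : 365;
  }

  /* whole months since January of the target year */
  for ( m = 1; m < month; m++ ) {
    days += days_in_month[m - 1];
    if ( m == 2 && is_leap_year( year ) ) {
      days += 1; /* 29 February */
    }
  }

  return( days * 86400LL );
}
```

Adding 12 hours (43200 seconds) to expected_month_start( 2012, 1 ) gives 1325419200, and doing the same for November gives 1351771200, which agree with the first and last lines of the output above.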

Finally a memory leak check was performed on the routine.

user@system:/tmp$ valgrind --track-origins=yes --leak-check=full ./test_months
==28462== Memcheck, a memory error detector
==28462== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==28462== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==28462== Command: ./test_months
==28462== 
2012-01-01 12:00:00 = 1325419200
2012-02-01 12:00:00 = 1328097600
2012-03-01 12:00:00 = 1330603200
2012-04-01 12:00:00 = 1333281600
2012-05-01 12:00:00 = 1335873600
2012-06-01 12:00:00 = 1338552000
2012-07-01 12:00:00 = 1341144000
2012-08-01 12:00:00 = 1343822400
2012-09-01 12:00:00 = 1346500800
2012-10-01 12:00:00 = 1349092800
2012-11-01 12:00:00 = 1351771200
==28462== 
==28462== HEAP SUMMARY:
==28462==     in use at exit: 0 bytes in 0 blocks
==28462==   total heap usage: 17 allocs, 17 frees, 3,183 bytes allocated
==28462== 
==28462== All heap blocks were freed -- no leaks are possible
==28462== 
==28462== For counts of detected and suppressed errors, rerun with: -v
==28462== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

No detected memory leaks in this function!