# Coding Challenge: Increment a date/timestamp string

## Posted 2 Nov 2004 at 06:21 UTC by ncm

I encountered a funny problem. I have a file name containing a 14-digit date/timestamp string, like "20041102235959". My code has no interest in what time it indicates, but just needs to transform it into another value that represents *one second later*. The challenge: write an elegant function to increment a textual timestamp.

You might think, "I can do that in five lines with `sscanf`, `mktime`, and `sprintf`!" And you should think that,
although you'd find the lines longish.

But consider the string as a pure 14-digit number, with digits that
have very odd add-with-carry relationships. What does the code look
like that adds one to such a number without depending on any external
library? Such a function could be a *lot* faster than calling
`mktime` etc., and (more importantly) a lot more fun to code.
Curiously, the ASCII (actually, BCD) representation is ideal for
leap-year identification, demanding only bitwise operations.

Assume the input string represents a valid 14-digit date/time value,
null-terminated, that may be overwritten. I'll post a link to my
most elegant (33-line) solution in a few days. Extra credit for
skipping over optional non-digit punctuation, e.g. "2004/11/02 23:59:59".

000000 IDENTIFICATION DIVISION.
000000 PROGRAM-ID. KS-BUMP.
000000 AUTHOR. RINGBARK.
000000 ENVIRONMENT DIVISION.
000000 CONFIGURATION SECTION.
000000 SOURCE-COMPUTER. NCR-IMOS.
000000 OBJECT-COMPUTER. NCR-IMOS.
000000* it's a long time since I last wrote the first three divisions
000000* so please excuse me if I'm not quite right with them
000000 DATA DIVISION.
000000 LINKAGE SECTION.
000000 01 WS-DATE-TIME.
000000 03 WS-YEAR PIC 9999.
000000 03 WS-YEAR-RDF REDEFINES WS-YEAR.
000000 05 WS-CC PIC 99.
000000 88 WS-CENTURY-LEAP-YEAR VALUE 20, 24, 28, 32, 36, 40, 44, 48,
000000 52, 56, 60, 64, 68, 72, 76, 80,
000000 84, 88, 92, 96.
000000* ignores century years before 2000 deliberately
000000 05 WS-YY PIC 99.
000000 88 WS-LEAP YEAR VALUE 04, 08, 12, 16, 20,
000000 24, 28, 32, 36, 40,
000000 44, 48, 52, 56, 60,
000000 64, 68, 72, 76, 80,
000000 84, 88, 92, 96.
000000 88 WS-CENTURY VALUE 00.
000000 03 WS-MONTH PIC 99.
000000 88 WS-30-DAY VALUE 09, 04, 06, 11.
000000 88 WS-31-DAY VALUE 01, 03, 05, 07, 08, 10, 12.
000000 88 WS-FEBRUARY VALUE 02.
000000 03 WS-DAY PIC 99.
000000 03 WS-HOUR PIC 99.
000000 03 WS-MINUTE PIC 99.
000000 03 WS-SECOND PIC 99.
000000 PROCEDURE DIVISION USING WS-DATE-TIME.
000000 MAIN SECTION.
000000 BUMP-TIME.
000000 ADD 1 TO WS-SECOND.
000000 IF WS-SECOND NOT EQUAL TO 60
000000 GO TO BUMP-TIME-X.
000000 MOVE ZEROES TO WS-SECOND.
000000 ADD 1 TO WS-MINUTE.
000000 IF WS-MINUTE NOT EQUAL TO 60
000000 GO TO BUMP-TIME-X.
000000 MOVE ZEROES TO WS-MINUTE.
000000 ADD 1 TO WS-HOUR.
000000 IF WS-HOUR NOT EQUAL TO 24
000000 GO TO BUMP-TIME-X.
000000 MOVE ZEROES TO WS-HOUR.
000000 ADD 1 TO WS-DAY.
000000 IF WS-DAY LESS THAN 29
000000 GO TO BUMP-TIME-X.
000000* preceding condition to skip next ugly condition most of the time
000000 IF ( WS-31-DAY AND WS-DAY = 32 )
000000 OR ( WS-30-DAY AND WS-DAY = 31 )
000000 OR ( WS-FEBRUARY AND
000000 WS-LEAP-YEAR AND WS-DAY = 30 )
000000 OR ( WS-FEBRUARY AND
000000 WS-CENTURY AND WS-CENTURY-LEAP-YEAR AND WS-DAY = 30 )
000000 OR ( WS-FEBRUARY AND
000000 NOT ( WS-LEAP-YEAR OR WS-CENTURY ) AND WS-DAY = 29 )
000000 MOVE 01 TO WS-DAY
000000 ADD 1 TO WS-MONTH.
000000 IF WS-MONTH EQUAL TO 13
000000 MOVE 01 TO WS-MONTH
000000 ADD 1 TO WS-YEAR.
000000* fails after 9999, but field not big enough to handle Y10K problem anyhow.
000000 BUMP-TIME-X.
000000 EXIT PROGRAM.

**You're right.**, posted 2 Nov 2004 at 13:54 UTC by ncm »
(Master)

Thanks for the reminder why we all (still) hate COBOL, and for the most graceful possible example of the opposite of elegant.

**A sample case**, posted 2 Nov 2004 at 16:25 UTC by nymia »
(Master)

The algorithm is shown below.

1 11 11 1
2004 11 01 11 59 59
+ 1
========================
2004 11 02 00 00 00

**Bad sample**, posted 2 Nov 2004 at 16:51 UTC by ncm »
(Master)

For the record, nymia's sample has the hours wrong: they roll over after 23, not 11. Consider, rather:

1 1 11 11 1
2000 02 29 23 59 59
+ 1
========================
2000 03 01 00 00 00

Be sure to get the centennial and quadricentennial Februaries right!

**Elegance**, posted 3 Nov 2004 at 08:51 UTC by ncm »
(Master)

I'm starting to get solutions in the mail. The question came up: what do I mean by elegant?

Every elegant solution iterates or recurses. Recursive solutions are *ever so slightly* more elegant that iterative ones. Solutions that accommodate punctuation, and reproduce it in the result, are more elegant. Solutions that operate per-character, rather than stepping by twos, are more elegant. Solutions that avoid multiplications and divisions are more elegant. Solutions that avoid conversions between ASCII (or EBCDIC, as the case may be) and binary are more elegant. Solutions that work the same on ASCII and EBCDIC native codesets are more elegant. Solutions that use bitwise operators rather than sequences of comparisons are more elegant. Solutions that check for a leap year only when necessary are more elegant. Solutions that use lookup tables only for truly irregular specifications are more elegant.

I've been refining my solutions; they still in no wise resemble anything else I've seen. The iterative one is 29 lines. The recursive one is also 29 lines, and while it's a bit slower, it accommodates punctuation properly, and is nicer to look at.

date +%s -d now+1second; date +%s

you should check the source of date for details.

**Some solutions**, posted 3 Nov 2004 at 22:13 UTC by ncm »
(Master)

I've got a few solutions via e-mail. In the interest of stirring up some inspiration, here they are, in order of receipt. Can they be improved upon?

Yoann Aubineau --
(yaubi)

Colin Leroy --
(colinleroy)

Mauro Persano --
(mpr)

They are slightly altered for presentation; edits welcome. I'll save mine out for a couple more days. Some good strings to test with are

19991231235959
20000228235959
20000229235959
20040228235959
21000228235959
19000228235959
16000228235959

that's 31-lines, with skipping over non-digits, and three spacing lines that can be removed to make it a 28-liner.

That made it for a good half-hour fun, but this kind of code tends to be completely unmaintainable. I hope you don't rely on such things for production :-)

void bump_timestamp( char* stamp )
{
int maxs[14] = { 9, 5, 9, 5, 9, 2, 9, 3, 9, 1, 9, 9, 9, 9 };
int vals[14], pos[14], nn, mm, v, month;

for ( nn = 0, mm = 0; nn < 14; nn++ ) {
for ( ; (unsigned int)(v = (stamp[mm]-'0')) >= 10; mm++ );
pos[13-nn] = mm++;
vals[13-nn] = v;
}

if ( vals[5] == 2 ) maxs[4] = 3; /* adjust hour max */
if ( vals[9] == 1 ) maxs[8] = 2; /* adjust month max */
if ( (month=(vals[8]+vals[9]*10)) == 2 ) { /* handle february */
if ( vals[7] == 2 ) {
maxs[6] = 8 + ( ((vals[10]+2*vals[11])&3)==0 && ( vals[10]!=0 || vals[11] != 0 || ((vals[12]+2*vals[13])&3)==0 ) );
maxs[7] = 2;
}
}
else if ( vals[7] == 3 ) { /* handle other months */
maxs[6] = (0x2AD5 >> (month-1)) & 1;
}

for ( nn = 0; nn < 14; nn++ ) {
if ( vals[nn] < maxs[nn] ) {
stamp[pos[nn]] += 1;
break;
}
stamp[pos[nn]] = '0' + (((1 << nn) & 320) != 0 && vals[nn+1] == maxs[nn+1]);
}
}

The final loop can be written as

for ( nn = 0; nn < 14 && vals[nn] >= maxs[nn]; nn++ )
stamp[pos[nn]] = '0' + (((1 << n) & 320) != 0 && vals[nn+1] == maxs[nn+1]);
stamp[pos[nn]] += 1;

Which reduces the program by 4 lines. We're now at 24 lines, and something really, really, difficult to understand without aspirin :-)

Shame on me for spending time on this thing

**Ugh**, posted 4 Nov 2004 at 16:06 UTC by mpr »
(Journeyer)

Ugh. ncm posted my entry. For the record: I wouldn't call it elegant - "fugly" would be more appropriate. I just tried to follow some of ncm's elegance guidelines (recursive, one digit at a time, and so on). Dang, now I got to improve it.

I have not written the code yet, but I would imagine the hour segment will be coded using the logic circuit shown below.

C = Carry

A,B = Addends

R = Sum

R' = Exception

R0 = A0 + B0 + C0

R1 = A1 +B1 + C1

R0' = R0 - 4 if R1=2

R0' = R0 if R1 in [0,1]

R1' = 0 if R1 = 2

R1' = R1 if R1 in [0,1]

A1 B1 C1 A0 B0 C0
o o o- o o o
| | | \ | | |
\ | / \ \ | /
===== \ =====
| | \ | |
| | ----+ |
C2 .R1 .R0
.......... .
. . .
. o o
..... | |
. \ /
. ===
. |
. |
. .
. .............
. . .
o o .
| | .
\ / .
=== .
| .
|R1' .R0'
| .

**ASCII Art**, posted 5 Nov 2004 at 00:56 UTC by ncm »
(Master)

nymia: please don't feel obliged to post the diagram for the day-of-month/leap-year part. :-)

By the way, I didn't say so before, but these times are GMT, so there's no Summer Time adjustments to worry about. Entrants so far seem to have figured that out for themselves.

Oops, my latest optimization didn't handle correctly the case where the year did overflow (i.e. 9999 => 0000). Here's a new version of the code where I did get rid of the multiplication by 10. That's 27 lines with spacers, or 22 without them.

This is getting meaningless

void bump_timestamp( char* stamp )
{
int nn, mm, v, vals[14], pos[14], maxs[14] = {9,5,9,5,9,2,9,3,9,1,9,9,9,9};
for ( nn = 0, mm = 0; nn < 14; nn++ ) {
for ( ; (unsigned int)(v = (stamp[mm]-'0')) >= 10; mm++ );
pos[13-nn] = mm++;
vals[13-nn] = v;
}
if ( vals[5] == 2 ) maxs[4] = 3; /* adjust hour max */
if ( vals[9] == 1 ) maxs[8] = 2; /* adjust month max */
if ( vals[8]==2 && vals[9]==0 ) { /* handle february */
if ( vals[7] == 2 ) {
maxs[6] = 8 + ( ((vals[10]+2*vals[11])&3)==0 && ( vals[10]!=0 || vals[11] != 0 || ((vals[12]+2*vals[13])&3)==0 ) );
maxs[7] = 2;
}
} else if ( vals[7] == 3 ) { /* handle other months */
maxs[6] = ((0x29AA >> vals[8])^vals[9]) & 1;
}
for ( nn = 0; nn < 14 && vals[nn] >= maxs[nn]; nn++ )
stamp[pos[nn]] = '0' + (((1 << nn) & 320) != 0 && vals[nn+1] == maxs[nn+1]);
if ( nn < 14 ) stamp[pos[nn]] += 1;
}

**Posting solutions**, posted 5 Nov 2004 at 15:00 UTC by ncm »
(Master)

Probably any more solutions posted should be links to diary entries. (The ">>" link at the top of the entry has the URL.) You can avoid cluttering the "recent diaries" page by immediately putting in another diary entry. Besides keeping this page clean, it means you can edit your entry after posting, as you get new ideas.

It's fun to see people rediscovering ideas I had, and also to see ideas I never would have thought of. I particularly like the "if ((1 << n) & mask)" trick, which I've used in other contexts but didn't think of here.
Literal masks should be hex, though, like "0x140" instead of "320" -- this isn't an obfuscation contest! Clarity is elegant too. That means, too, that extra-long lines count against you, and blank lines don't.

Don't be shy about stealing from posted solutions. That's how progress happens, by combining good ideas. The field is far from mined out -- I can see major improvements possible in everything posted so far. My hope is to evolve a "most elegant possible" solution, at the end, with subcategories for iterative, recursive, and "other".

One more point of elegance: tables that are static const are better than writable tables patched during the run.

**wow**, posted 7 Nov 2004 at 21:01 UTC by cmm »
(Journeyer)

it is quite surprising to see how many people can *increment* a *string*.

**My best so far...**, posted 9 Nov 2004 at 06:25 UTC by ncm »
(Master)

I have three.

First, here is an iterative version, 22 lines.

Second, here is a recursive version, 21 lines.
I like the three cases of indexing into a (different) constant literal
string. This one handles one byte of punctuation per digit-pair, but it's a lot slower than the others.

Third, here is an unrolled
version, 24 lines. By the rules it's not elegant, but like ringbark's entry, it might be called graceful, if
a bit — as Mauro remarked — evil.
If you were inclined, like nymia, to implement
it in hardware, you might start with this one. (If you prefer
to see shorter statements, consider this one instead, 27 lines.)

To notice: In `month = (stamp[4] * 10 + stamp[5]) & 0xf;`
these are character values being operated on, and the "& 0xf" at
the end cleans up. The iterative and recursive versions clean up the
month and day units-digits after the fact, but from opposite
directions. All avoid running the complicated tests on digit 7
if it happens to be a '9'.

Can you improve on any of these?

**Improvement**, posted 9 Nov 2004 at 09:50 UTC by freetype »
(Master)

First, there is a bug in your code, which incorrectly computes 2010 as a leap year. You should replace the `(stamp[3] & 3) != 0` with something like `(((stamp[2] << 1)+stamp[3]) & 3) != 0)` instead. Of course, this will probably mean one more line to each one your routines ;-)

Seeing various solutions posted here, including your own, I suppose you were kidding when you said:

*
this isn't an obfuscation contest! Clarity is elegant too
*

Any contest which values minimal line counts will naturally pick best solutions among the most obfuscated and unmaintainable programs.

And now, here's a 19-lines recursive routine which is capable of skipping over punctuation (remove one line if you don't need to support that). It also doesn't use any multiplication. I suppose it's possible to do better, but I'll let this as an exercise to the reader ;-)

int bump( char* s, int n = 0, char* y = NULL, char* m = NULL )
{
int max = "999919392959590"[n], result = (n >= 14);
while ( ((1 << n) & 0x1551) && *s && (unsigned)(*s-'0') >= 10 ) s++;
if ( n < 14 && *s && bump( s+1, n+1, (n==0) ? s : y, (n==4) ? s : m ) ) {
max -= (6 & -( n==9 && s[-1]=='2' )) + (7 & -( n==5 && s[-1]=='1' ));
if ( n==7 ) {
if ( s[-1]=='2' && m[0]=='0' && m[1]=='2' ) {
max = '8' + ( ((y[3]+(y[2]<<1))&3)==0 && ( (y[2]|y[3])!='0' || \
((y[1]+(y[0]<<1))&3)==0 ) );
if ( *s >= max ) s[-1] = '3';
}
else if ( s[-1]=='3' ) max = '0'+ (((0x29AA >> (m[1]-'0'))^m[0]) & 1);
}
result = (*s >= max);
*s = !result ? *s+1 : '0'+(((1 << n)&0xA0) != 0);
}
return result;
}

**PHP Solution**, posted 9 Nov 2004 at 14:28 UTC by prozac »
(Journeyer)

// timeinc - increment a time/date stamp by N seconds
function timeinc($time, $n) {

return $time + $n;
}

This works.

One can verify it by use of mktime() to create two times one second apart, and then get the difference between them for the interval between two seconds. It just so happens that it is 1 in PHP.

Prozac, I think you misunderstood the problem. The PHP "timestamp" is not what we're trying to increment here. The current competition's input is a human-readable formatted date, not the number of seconds since the Unix epoch. The crux of the problem is to write a routine that can modify the human-readbla estring directly, without using helper routines

Of course, this is totally useless, or I wouldn't dare posting there :-)

Here is a concise solution in Python:

def dateinc(s) :
def carry(list, modulus) :
if len(list) == 1 : return (list[0],)
else : q = carry(list[1:], modulus[1:]); return (list[0] + q[0] // modulus[1],) + q
s = reduce(lambda x,y:x+y, map(lambda x : (x >= '0' and x <= '9') * x + '', s))
result = map(lambda x,y:x+y, map(int,(s[0:4],s[4:6],s[6:8],s[8:10],s[10:12],s[12:14])),(0,-1,-1,0,0,1))
modulus = [10000,12,[31,28,31,30,31,30,31,31,30,31,30,31][result[1]],24,60,60]
if result[0] % 4 == 0 and (result[0] % 100 != 0 or result[0] % 400 == 0) :
modulus[2] += 1 # Account for leap day.
result = carry(result, modulus) # Carry the extra second.
result = map(lambda x,y:x%y, result, modulus) # Reduce each element.
result[1:3] = map(lambda x:x+1, result[1:3]) # Readjust to 1-base.
return reduce(lambda x,y:x+y, map(lambda x : (x < 10) * '0' + str(x), result))

print dateinc('1999/12/31 23:59:59')
print dateinc('20000228235959')
print dateinc('20000229235959')
print dateinc('20040228235959')
print dateinc('21000228235959')
print dateinc('19000228235959')
print dateinc('16000228235959')

Save one line by propagating the carry bit:

def dateinc(s) :
def carry(list, modulus) :
if len(list) == 1 : return [list[0] // modulus[0], list[0] % modulus[0]]
else : q = carry(list[1:], modulus[1:]); return [(list[0] + q[0]) // modulus[0], (list[0] + q[0]) % modulus[0]] + q[1:]
s = reduce(lambda x,y:x+y, map(lambda x : (x >= '0' and x <= '9') * x, s))
result = map(lambda x,y:x+y, map(int,(s[0:4],s[4:6],s[6:8],s[8:10],s[10:12],s[12:14])),(0,-1,-1,0,0,1))
modulus = [10000,12,[31,28,31,30,31,30,31,31,30,31,30,31][result[1]],24,60,60]
if result[0] % 4 == 0 and (result[0] % 100 != 0 or result[0] % 400 == 0) :
modulus[2] += 1 # Account for leap day.
result = carry(result, modulus)[1:] # Carry the extra second.
result[1:3] = map(lambda x:x+1, result[1:3]) # Readjust to 1-base.
return reduce(lambda x,y:x+y, map(lambda x : (x < 10) * '0' + str(x), result))

It's possible to make it a 9-liner :-) Note that the routine is slightly broken because it destroys the punctuation, it doesn't just skip over it.

def dateinc(s) :
s = reduce(lambda x,y:x+y, map(lambda x : (x >= '0' and x <= '9') * x + '', s))
result = map(lambda x,y:x+y, map(int,(s[0:4],s[4:6],s[6:8],s[8:10],s[10:12],s[12:14])),(0,-1,-1,0,0,1))
modulus = [10000,12,[31,28,31,30,31,30,31,31,30,31,30,31][result[1]],24,60,60]
modulus[2] += (result[0] % 4 == 0 and (result[0] % 100 != 0 or result[0] % 400 == 0))
for x in range(len(result)-1): result[-x-2] += (result[-x-1] >= modulus[-x-1])
result = map(lambda x,y:x%y, result, modulus) # Reduce each element.
result[1:3] = map(lambda x:x+1, result[1:3]) # Readjust to 1-base.
return reduce(lambda x,y:x+y, map(lambda x : (x < 10) * '0' + str(x), result))

The following uses builtins instead of one-off lambda functions, wherever possible. I've also reworked the `carry` function into a one-liner that computes the carry in-place.

def dateinc(s) :
s = reduce(lambda x,y:x.isdigit()*x+y.isdigit()*y, s)
ans = map(int.__add__, map(int,(s[0:4],s[4:6],s[6:8],s[8:10],s[10:12],s[12:14])),(0,-1,-1,0,0,1))
mod = [10000,12,[31,28,31,30,31,30,31,31,30,31,30,31][ans[1]],24,60,60]
if ans[0] % 4 == 0 and (ans[0] % 100 != 0 or ans[0] % 400 == 0) : mod[2] += 1
ans = map(lambda x:(ans[x] + reduce(int.__mul__, map(lambda y:ans[y]>=mod[y]-1,range(x+1,len(ans))),x!=len(ans)-1)) % mod[x], range(len(ans)))
ans[1:3] = map(int.__add__,ans[1:3],[1,1]) # Readjust to 1-base.
return reduce(str.__add__, map(lambda x : (x < 10) * '0' + str(x), ans))

Thanks, freetype! Looks like we crossed signals. :) Thanks for pointing out the punctuation problem; I'd misread the original specification. It shouldn't be difficult to fixup the result using the original string as a model, but I won't take the time to work that out now.

It'a a fine point whether `("0000010100000"[i])` is better than
`('0' + ((0xA0 >> i) & 1))`. The first involves an access to
memory that is probably in L1 cache; the second just needs a few ALU
operations. On a P4, however, a shift `(C >> i)` takes i
cycles because Intel omitted the barrel shifter from the P4.
(Evidently they couldn't make it work in just one cycle.) Most other modern CPUs have them, but many minimal ones don't. Given the ambiguity, the indexed solution seems clearer.

I don't think a Python (or Perl, or PHP, or VB) solution can
really qualify here. Any of such code necessarily depends on all
kinds of library support, so that probably most of the instructions
executed during the call cannot be traced to anything in the source
text. (That's not to say that these examples won't serve equally
well as evidence of too much free time.)

Thanks to freetype for an excellent entry, and for the correction. Wouldn't you consider, though,

if (s[-1] == "xxxxx1xxx2xxxx"[n]) { max -= 8 - s[-1]; }

clearer than

max -= (6 & -( n==9 && s[-1]=='2' )) + (7 & -( n==5 && s[-1]=='1' ));

? I freely admit that the

`(8 - s[i])` relationship is appallingly arbitrary. In a similar vein, to say

`('0' + (((0x29AA >> (m[1]-'0'))^m[0]) & 1))` was fiendishly
clever (0x1AA would have sufficed) but is it better than

`("0101010110"[m[0] + m[1] & 0xf])` or (in the other direction)

`('0' + (((m[0] >> 3) + m[0] + m[1]) & 1))`?

**correction**, posted 10 Nov 2004 at 21:37 UTC by ncm »
(Master)

Sorry, that should have been `max -= ('8' - s[-1])` or, equivalently,

if (s[-1] == "xxxxx1xxx2xxxx"[n]) { max = s[-1] + 1; }

Here's a way that destructively does it. Add a malloc for non-destructive. Thread safty could be improved. It is elegant in that it gets libc to do the work for you. This code also assumes UTC or some other time that doesn't have DST. It ignores leap seconds completely. Most of the other solutions posted here likewise ignore these complications. It won't work accross calender conversion dates either (eg, when the UK adopted the Gregoran calender over the earlier Julian calender). Or any other unknown discontinuity in time. Using mktime and localtime could overcome some of these problems if that's what you are trying to solve.

char *
incr_time(char *s)
{
struct tm tm;
time_t t;
strptime(s, "%Y%m%d%H%M%S", &tm);
t = timegm(&tm) + 1;
tm = *gmtime(&t);
strftime(s, "%Y%m%d%H%M%S", &tm);

return s;
}

imp: The original article reads, in part, "You might think, 'I can do that in five lines with sscanf, mktime, and sprintf!' ... But ... What does the code look like that [works]...

*without depending on any external library*?" We frown on use of

*division* because on many CPUs it's a subroutine call.

Why is this concept so difficult?

**Further adventures**, posted 12 Nov 2004 at 22:10 UTC by ncm »
(Master)

Using what we've learned, this one is 18 lines, iterates,
handles punctuation, operates byte-by-byte, eschews division, multiplication and BCD<->binary conversions,
uses only constant literal arrays (except the argument),
rolls year 9999 -> 0000, has helpful variable names and
formatting, compiles with "gcc-3.4 -Os" to 114 amd64 instructions,
and is almost, but not quite, completely useless.

void increment_datetime(char* date)
{
char digit = 13, *cur, *mo = date + 4 + ((unsigned)(date[4] - '0') > 9);
for (cur = date + 14; *cur != '\0'; ++cur) {}
do {
int skip = (((unsigned)(*--cur - '0') > 9) && --cur);
if (*cur !=
((digit > 4 && cur[-1] == "xxxxx1xxx2xxxx"[digit]) ? cur[-1] + 1
: ((digit == 6 && (mo[0] | mo[1]) == '2') ? '2'
: ((digit == 7) ? ((cur[-1] == '3') ? "0101010110"[(mo[0] + mo[1]) & 0xf]
: '9' - ( cur[-1] == '2' && (mo[0] | mo[1]) == '2' &&
(((date[2] << 1) ^ date[3]) & 3) || ((date[2] | date[3]) == '0' &&
(((date[0] << 1) ^ date[1]) & 3))))
: "99991939295959"[digit] )))) { ++*cur; break; }
*cur = '0';
if ((digit | 2) == 6) { cur[1 + skip] = '1'; }
} while (digit-- > 0);
}

I'm sure there's still plenty of room for improvement.

p.s. to imp: No offense meant, you probably just
missed seeing that paragraph. It's the other people who mystify me.
Do people really imagine that Python, PHP, XQuery don't depend on
enormous libraries? I doubt that even 1% of the instructions
executed in those implementations have anything to do with the
problem being solved.

ncm, as you didn't mention C, I read "library function" more liberally than you clearly intended. I didn't "import", after all. ;) But mostly, I glossed over all but "a lot more fun to code" and so decided to have fun with Python's functional idiom. Sorry, never intended to represent it would do well on instruction count.

Don't chide me too much, now; you might tempt me to waste some more time attempting it in C. ;)

I'm still hoping to see a crushingly elegant O'Caml solution that compiles to fewer instructions than the best C version.

void itime(char * d)
{
int i, j, dd[7], tmax[] = {-1, 100, 12, -1, 24, 60, 60},
md[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
for (i = 0; i < 14; i += 2)
dd[i/2] = (d[i] - '0') * 10 + (d[i+1] - '0');

tmax[3] = md[dd[2]-1];
if (dd[2] == 2 && !(dd[1] % 4) && (dd[1] != 0 || !(dd[0] % 10))) tmax[3]++;

dd[6]++;
for (i = 5; i >= 0; i--) {
while (dd[i+1] >= tmax[i+1]) {
dd[i+1] -= tmax[i+1];
dd[i]++;
}
}

for (i = 0; i < 14; i += 2) {
d[i] = (dd[i/2] / 10) + '0';
d[i+1] = (dd[i/2] % 10) + '0';
}
}

**Follow-up**, posted 18 Nov 2004 at 06:34 UTC by jab »
(Journeyer)

Just realized eskimoses did the same thing in python. Also, my "!(dd[0] % 10)" should be a "!(dd[0] % 4)" (leap on, off, on thing is every 4, 100, 400; not 4, 100, 1000 as I had originally).

But nonetheless, a demonstration that C code can be concise (ala python) without looking insane in comparison.

**My god**, posted 18 Nov 2004 at 11:53 UTC by freetype »
(Master)

ncm, I'm starting to think that you're taking this small competition

*much* too seriously. :-) I seriously doubt that specific CPU implementation details, memory cache latencies and instruction counts are really that interesting. Isn't this article just a fun way to exercise our "hacker" muscle byexploring fun way to exploit our favorite languages ?

Besides, your last implementation is truly great, except that I don't understand why you're using an exclusive or to compute leap years. I mean, since you're obviously focused on instruction count, you should know that something like a+2*b can be computed by a single instruction on x86.

And I'd be extremely surprised to see a shorter solution in OCaml, both in terms of line count and generated instructions. I know pretty well how this language is implemented, and I don't see how you could implement the same with less code without basically duplicating the C code. And does the memory allocator + GC counts as an external library in such case ?

Anyway, take care.

I agree with freetype that fun exploration is the main purpose here. However, there are different measures of elegance. If you think more elegance can be achieved by going outside some of the parameters of the original challenge, do so, but state which ones you're going outside of. Otherwise folks who eschewed a particular solution because it was out of bounds might feel like they're being made to look bad.

**Competition?**, posted 18 Nov 2004 at 17:17 UTC by ncm »
(Master)

I don't see this as a competition. It's a challenge: how much unnecessary computation can be squeezed out of the implementation of a small but necessarily messy function? If you're not interested in how your favorite language arranges to get its semantics executed, you're certain to make habitually bad choices, in cases where performance matters. Yes, it's just for fun, but as in everything fun there's something deadly serious lurking in the background.

(Incidentally, now that Moore's Law is petering out, we're all going to be counting cycles again pretty soon, and cursing those of our predecessors who wrote in languages not suitable for compilation to efficient code.)

I'm not focused on instruction count, any more than on line count. Both are approximate measures of the amount of computation invoked. Certainly any instruction that does multiplication is doing *way* more computation than a shift-xor-and, regardless of instruction count; treble that for divide or modulus. But maybe I miss your point. Are you reminding me that `((a<<1)+b)` can use spare address arithmetic machinery (as exposed by the common LEA instruction family), where `((a<<1)^b)` must use a complete ALU? Excellent point, maybe. OTOH, xor is cheaper to compute than addition, it's just that there are more adders scattered around in a typical CPU, and addition and xor both take only one cycle anyhow. (I wonder if we'll see CPUs that take two cycles for an addition because of the additional gate delay of the carry logic.)

nymia might have framed the problem most neatly. Suppose your solution were to be implemented in hardware? What would yield the smallest product of transistor-count times cycles? This measure is not language-dependent, and it emphasizes exploiting observations that may be applied in any language. It's expensive to quantify exactly, but we can do a pretty good job approximating it.

A solution in OCaml need not be shorter in line count if it's prettier. Certainly, though, it should have no need to invoke GC during execution of the function in question. I mentioned OCaml just because it is often (and almost uniquely) cited as producing code that is faster than the equivalent C or C++ code, at least for artificial problems (such as this one). A Common Lisp solution would suffice, too, because CL is commonly compiled to machine code.

My last implementation still has several bits of fat, at least. And you're right, I've been taking this way more seriously than it deserves -- but really, isn't that necessarily true about anything we hope to learn from?

I would expect a good compiler to optimize multiplication, division and modulo into bitwise operations if one of the operands is a constant power of two.

**last digit 0-8**, posted 19 Nov 2004 at 14:58 UTC by brlewis »
(Journeyer)

Is my optimization-focused solution the first one that actually returns after incrementing the last digit by one if no more operations are necessary? I'm a little surprised nobody showed interest in this optimization.

**returns**, posted 20 Nov 2004 at 01:21 UTC by ncm »
(Master)

brlewis: what does

`{ ++*cur; break; }` mean to

*you*? And, mightn't it be better never to compute the month

*at all* than to wait to compute it until circumstances come along where you could use it?

Just adding a qualifier to brlewis's comment:

Depending on the processor, replacing a multiplication, division, or modulo with bitwise operations may depend not only on one operand being a constant power of two, but also on the other operand being unsigned.

It looks like ringbark beat you to the punch ... his reply happens to contain a solution (albeit COBOL) that avoids unnecessary operations.

Sorry I did not communicate clearly in my last post.

My thought was that if you *first* test for the last digit being 0-8, you avoid any unnecessary operations. If speed is what you're after, you should order your tests such that the most common case comes first. The fact that most solutions don't work in this way shows that people are more interested in concise code than "way fast" code, which I actually agree with.

ncm, you do a number of tests before falling through to the most common case, and one of your tests has a bug. Try incrementing `'2002-04-14 05:15:15 rox!'`

I don't know the language myself, but the COBOL solution appears to convert the last two digits to an integer regardless of the value of the final digit, then add one to that digit.

**Bit twiddling**, posted 24 Nov 2004 at 14:28 UTC by mpr »
(Journeyer)

Here's a way to increment digits without conditional branches:

/* XXX: 32-bit-centric */
void
bump_hhmm(char *h)
{
int r, i, m, c = 1;
static int max[] = { 6, 10, 6, 10 };

for (i = 3; i >= 0; --i) {
r = d[i] - '0' + c;
m = ((r ^ max[i]) /* zero if r == max[i] */
+ 0x7fffffff) /* will overflow to msb if r != max[i] */
>> 31; /* m = 0 if r == max[i], ~0 otherwise */
d[i] = '0' + (r & m); /* zero out result if it's == max[i] */
c = (1 & m) ^ 1; /* next carry */
}
}

This is just for hours and minutes, though. I can't think of a "clean" way to extend it to situations where you need two digits to know the carry (hours, days and months).

Checking the last digit for 0-8 may not be faster. Hypothetically:

(a) If we don't check the last digit for 0-8, then the performance is:

90% x (T.increment) + 10% x (T.increment + T.reset)

vs. (b) If we check the last digit for 0-8, then the performance is:

100% x T.checkdigit + 90% x (T.increment)

So, excluding the check for carry, if each operation (T.increment, T.reset, and T.checkdigit) takes a clock cycle to execute, then over 10 runs, (a) takes 11 clock cycles while (b) takes 19 clock cycles.

As for the implicit integer conversion in the COBOL solution: PICs are how variables are declared in COBOL, as far as fundamental data types are concerned.

**Gah**, posted 29 Nov 2004 at 11:05 UTC by mpr »
(Journeyer)

When I wrote "hours and minutes" and "hhmm", of course I meant
"minutes and seconds" and "mmss"...

Umm... with all the talk of instruction counts and gate delays, I hesitate to bring up XSLT, but this issue is no longer academic for me.

I wrote schedScrape.xsl which takes a schedule expressed in XHTML (ala tantek's lowercase semantic web) and converts it to an RDF vocabuarly for iCalendar, since I already have code to go from there to actual .ics format. See the travel schedule on my homepage for an example.

The trick is, for a date like 7-8 Apr 2005, the iCalendar spec wants dtend to be 20050409 , i.e. one day after 8 Apr. My code ignores this, and has a bug as a result.

XSLT's standard library doesn't have mktime and the like, so I actually need the algorithm in this challenge.

After skimming all the solutions, I don't actually understand the algorithm. Maybe enlightenment will come if I study some more. Or maybe somebody will throw me a bone? Explain the algorithm, please?

**d'oh**, posted 7 Jan 2005 at 02:36 UTC by connolly »
(Master)

enlightment came from the COBOL version. Boy do I feel silly.