Jump to content

Factorial Program in C++ - 1 Million Factorial


Recommended Posts

A few years ago I made a php script to work out the exact factorial of any number. (www.thecrimelife.net/william) Please don't enter any excessively large numbers as it will exceed my CPU limit.

Recently I decided to have another crack at working out 1 million factorial. So I wrote a program (still the same basic concept) in C++.

It is still slow and took my computer (which is fast) about 10 hours (I think; left it running over night) to work it out.

The result can be downloaded at:

www.thecrimelife.net/factorial (It's the million.txt file, right click -> Save target as)

The file is jsut over 5MB as there are 5,565,708 digits.

You can download the application at: www.thecrimelife.net/factorial/factorial.exe. It is only text based but it does the job.

The reason for it being slow is that in can only use 1 core and is computing at 32bit. If you look in the factorial directory you can see I have compiled a 64bit version and duel core one (Doesn't seem to work though on both cores); I have yet to run it for 1mill to see the speed.

Unfortunately due to limits of array size in C++ the max amount of digits for the array is 100,000,000(The 64bit version could be larger but I never tried). I may make a multi-dimensional one some time.

I may give the source out on request; it's very messy with lots of comments.

Have fun working factorials...

Link to comment
Share on other sites

Guest Anonymous

Re: Factorial Program in C++ - 1 Million Factorial

My little C version....

[nyna@ns001 ~/maths/fact]$ time ./fact 1000000

real 0m12.295s

user 0m10.415s

sys 0m0.155s

[nyna@ns001 ~/maths/fact]$ ls -l digits.txt

-rw-r--r-- 1 nyna nyna 5.3M Apr 19 16:31 digits.txt

[nyna@ns001 ~/maths/fact]$ wc -c digits.txt

5565709 digits.txt

[nyna@ns001 ~/maths/fact]$ head -c 100 digits.txt


[nyna@ns001 ~/maths/fact]$ tail -c 100 digits.txt


Not highly optimized, and the timings include the time to write out the 5.3Mb digits.txt file, but fairly speedy

Link to comment
Share on other sites

Re: Factorial Program in C++ - 1 Million Factorial

I wonder why mine is so slow; I wonder if is to do with running it on windows, more likely it's the the inefficiency of my program :lol:.

I tried running it on a ARM 7 Chip (only 10,000 due to the on board flash size) which was quite quick as it was only using a 20MHz clock; It is capable of running at around 200MHz. It was difficult to preform many tests as I hadn't made and sort of display and was using a debugger to view the array value. Also I couldn't time as the Chip can't time it; although the debugger has a time somewhere but I can't find it....

How did you work it out? I came up with (where f=the number to work out the factorial of):

for (int m=0; m<f; m++)
	a=tm % 10;
	if (c!=0 && s>=n)
while (s<(n+1));
Link to comment
Share on other sites

Guest Anonymous

Re: Factorial Program in C++ - 1 Million Factorial

Well I use some pretty good FFTs which makes the big multiplcation nice and fast.

Link to comment
Share on other sites

  • 5 months later...

Re: Factorial Program in C++ - 1 Million Factorial


I found that your factorialx64.exe program runs 35% faster than the factorial.exe one. I ran them on 64-bit and 32-bit Microsoft operating systems, respectively. I have an Athlon dual-core system. Your factorialCore2-64.exe wasn't any faster than the factorialx64.exe program.


Would you be willing to post or email me an executable of your factorial program? Either for Linux or Windows? Also, could you have the output automatically sent to a file rather than stdout? I'm curious to see if it's accurate for 10000000!, 20000000!,. . . Based on your very impressive results, this is by far the fastest factorial program I've seen on the web!

Link to comment
Share on other sites

Re: Factorial Program in C++ - 1 Million Factorial

The core2 only makes a slight difference on an intel duel core I think. The 64 result is about right due to caclulations being done quicker. Are you running a x64 OS, should only have worked if you did? I run XPx64.

Link to comment
Share on other sites

Re: Factorial Program in C++ - 1 Million Factorial

ok. I should try and make mine faster; perhaps I'm going the wrong way about it but I can't think of any other way. Perhaps make a second pointer so the loop doesn't run through 0's. Or stick more than 1 character in each key of the array.

Link to comment
Share on other sites

  • 8 months later...
  • 5 years later...

I made a version that uses basic optimizations that make it quite fast, without using FFTs. I ended up with 50 seconds for computing the factorial represented as a long binary integer, and around 800 seconds to turn it into a string (in decimal) and save it to a file. I have a feeling I could make the second part faster, but I don't know how right now.

The main optimization is to max out the capability of the multiply instruction. That means multiplying several numbers together (until multiplying by another one would put it above 32 bits). before multiplying by the entire array. So, for 50000!, it would take 50000*49999 = 2499950000 < 2^32, and then it would let the program know that the next number it needs to worry about is 49998. This is combined with another optimization that factors two out of every integer, so, what it would it would do for, say 16000! is 125 * 15999 = 1999875 < 2^32, and let the program also know that the final number needs to be shifted over 8 additional bits once the multiplication is done.

Now, the output of GetMultiply is multiplied by the number in 32 bit parts, which can I implemented like this:

uint64_t Carry = 0;

for (int var = 0; var < IterStop; var++){

Carry= Number[var] * MultTo + ((Carry) >> 32);

Number[var] = Carry;//only transfers the lower 32 bits


This actually requires MultTo to be less than 31 bits so that the addition of Carry does not overflow.

Turning this number into a decimal string is slightly easier, just divide each 32 bit section by 1000000000 < 2^32 and the remainder of the previous division is bit-shifted 32 bits and added to the number below. I am bad at explaining this, so here is my implementation:

while(Number.size() > 0)

if(Number.back() == 0)


uint64_t Remainder = 0;

for_each(Number.rbegin(), Number.rend(), [&](uint32_t & Num){

Remainder <<= 32;

Remainder += Num;

Num = Remainder / 1000000000 ;

Remainder %= 1000000000 ;


Then, I convert the remainder to a backwards string representation

Finally, I flip the string around.

This is more efficient than doing the multiplication and the division in one step like Will does because this divides the same number, "Remainder" by the same dividend, and so the compiler can do a single divide instead of two (the x86 div instruction does modular and regular division every time it divides any number).

So there is some basic stuff that the mathematically uninclined can do to make their factorials faster.

ps. The source code is 180 lines long

This got me down to about 300 seconds for the binary representation of the 1000000 factorial. The final drop to 50 seconds was the realization that the x86 computer can do a full 64 by 64 bit multiply without truncating, and so I did some inline assembly and made the GetMultiply group to numbers less than 64 bits and got it to go 6 times faster, because it groups better and does half as many multiply instructions. But that is kind of hard if you don't know assembly.

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

  • Create New...