Measuring run time of a process or function

Hi

I'm trying to measure the CPU run time of a process.

Currently I'm using GetTickCount and GetTickFrequency but the given time includes the context switch time and the run time of other processes running along with my process.

Is there an API function to get the precise CPU time of the process without the dependency in the condition of the OS and other processes

An example for that is the CPU time given for each process in the Task Manager, but I need a milisecond measurement.

Hoping to get help... Thanks!



Answer this question

Measuring run time of a process or function

  • dragoncells

    I think you can try a solution based on performance counters:

    LARGE_INTEGER frequency;

    QueryPerformanceFrequency(&frequency); // (execute once)

     

    LARGE_INTEGER begin;

    QueryPerformanceCounter(&begin);

     

    // some portion of code to be measured

    . . .

     

    LARGE_INTEGER end;

    QueryPerformanceCounter(&end);

     

    LONGLONG elapsed = end.QuadPart - begin.QuadPart;

    LONGLONG nanoseconds = elapsed * 1000000000L / frequency.QuadPart;

    LONGLONG milliseconds = nanoseconds / 1000000L;< xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />

    It seems the other processes do not affect the measured time.

     

    I hope this helps.

  • TonyX852

     Viorel. wrote:

    It seems the other processes do not affect the measured time.

     

    Its wrong. Run one instance of your mesure application end output a result.

    Then run sevrel instances of your mesure application simultaneously and compare results.

    They will quite diffrent.

    But QueryPerfomanceCounter is more precise.

    To avoid other process influence use, GetThreadTimes and GetProcessTimes.


  • Severin123

    Turn off hyper threading option in your computer.

    The and you will fill the diffrence.

    The reason: hyper threadin is hardware imulation of two procesor, but exactly one processor exists it runs in two times slowly. For OS your code runs on two diffrent processor in the same time. Thats why you see a diffrence in time but not between Ticks and ThreadTimes, because both of the thread are planned and working now, just proccessor works slowly.

    And if you turn off HT the only thread can be planed for OS at the same time. So you will see an essensial diffrence for Ticks and ThreadTimes.


  • aguess

    You are absolutely right!

     

    If I turn off hyper threading or use my older PC (751MHz Pentium III) everything works as expected and I have no problems reproducing your results.  :)  :)  :)

     

    The following article describes how to disable hyper threading for a process by setting the process's affinity mask to only one of the logical CPUs for each physical CPU:
    http://msdn.microsoft.com/msdnmag/issues/05/06/HyperThreading/

     

    Then everything also works as expected with hyper threading enabled.
    Here my translation of your C++ program into a C program that incorporates this.

    The function SetProcessAffinityToPhysicalCPUForHyperthreadOnly() currently assumes that hyper threading is enabled.

    I'll fix that later.

     

    #include <windows.h>
    #include <stdio.h>


    static void SetProcessAffinityToPhysicalCPUForHyperthreadOnly( int   processid )
    {
        HANDLE   hProcess;
        DWORD    ProcAffinityMask = 0, SysAffinityMask = 0;
       
        hProcess = OpenProcess( PROCESS_ALL_ACCESS, 0, processid );
        GetProcessAffinityMask( hProcess, &ProcAffinityMask, &SysAffinityMask );
       
        if ( SysAffinityMask == 3 ) /* 1 proc, 2 logical CPUs */
            SetProcessAffinityMask( hProcess, 1 );
        else if ( SysAffinityMask == 15 )  /* dual proc, 4 virtual CPUs */
            SetProcessAffinityMask( hProcess, 3 );
           
        CloseHandle( hProcess );
    }


    int   main( void )
    {
        __int64   creBeg, extBeg, krnBeg, usrBeg;
        __int64   creEnd, extEnd, krnEnd, usrEnd;
        __int64   i;
        __int64   krn, usr;
        DWORD     beg = GetTickCount(), end;
       
        SetProcessAffinityToPhysicalCPUForHyperthreadOnly( GetCurrentProcessId() );
        GetProcessTimes( GetCurrentProcess(), (FILETIME*)&creBeg, (FILETIME*)&extBeg, (FILETIME*)&krnBeg, (FILETIME*)&usrBeg );
        for( i=0; i<10000000000; i++ );
        end = GetTickCount();
        GetProcessTimes( GetCurrentProcess(), (FILETIME*)&creEnd, (FILETIME*)&extEnd, (FILETIME*)&krnEnd, (FILETIME*)&usrEnd );

        krn = krnEnd - krnBeg;
        usr = usrEnd - usrBeg;
        krn /= 10000;
        usr /= 10000;
        end-=beg;

        printf( "ProcessTimes: %d ms\n", (DWORD)usr+krn );
        printf( "Ticks: %d ms\n", (DWORD)end );
       
        return 0;
    }


  • Joao Richiard

    Aleksandr,

    Thank you very much for your valued feedback.  Unfortunately I am not able to reproduce your results.  I still experience that other processes running heavily influence the results I get.  I am sure that this has an explainable reason.

    Some background information:
    My field of expertise is UNIX and C programming and neither Windows nor C++ programming.  My brute force sudoku solver is my first C program under Windows.  I have Windows XP professional (SP2).  I am using the Open Watcom C/C++ compiler and instruct it to generate a Windows NT character-mode (command window) executable. 

    The Open Watcom compiler would not compile your program C++ as is.  Instead of pulling out my C++ books, I took the liberty of translating your C++ program into the following C program:


    #include <windows.h>
    #include <stdio.h>

    int   main( void )
    {
        __int64   creBeg, extBeg, krnBeg, usrBeg;
        __int64   creEnd, extEnd, krnEnd, usrEnd;
        __int64   i;
        __int64   krn, usr;
        DWORD     beg = GetTickCount(), end;
       
       
        GetThreadTimes( GetCurrentThread(), (FILETIME*)&creBeg, (FILETIME*)&extBeg, (FILETIME*)&krnBeg, (FILETIME*)&usrBeg );
        for( i=0; i<10000000000; i++ );
        end = GetTickCount();
        GetThreadTimes( GetCurrentThread(), (FILETIME*)&creEnd, (FILETIME*)&extEnd, (FILETIME*)&krnEnd, (FILETIME*)&usrEnd );

        krn = krnEnd - krnBeg;
        usr = usrEnd - usrBeg;
        krn /= 10000;
        usr /= 10000;
        end-=beg;

        printf( "ThreadTimes: %hd ms\n", (DWORD)usr+krn );
        printf( "Ticks: %hd ms\n", (DWORD)end );
       
        getchar();
        return 0;
    }


    The only change I made is that I increased the for loop count from your 1 billion to 10 billions, to have the program run some 16 seconds.  Hence I also had to change the variable ‘i’ from int to __int64.

    If I only run one copy of the program I get results like:
              ThreadTimes: 16703 ms
              Ticks: 16703 ms

    If I run two copies simultaneously, I get results like:
              ThreadTimes: 28375 ms
              Ticks: 28375 ms

              ThreadTimes: 28484 ms
              Ticks: 28484 ms


    If you would be so kind and compile my C program with your compiler and execute it in a command window and then give me your results, I would be very grateful.  If you get results identical to the ones with your C++ program, I probably have a (older) flawed “kernel32.dll”.

    My email address is:   borge[AT]anderssen[DOT]org

    Thank you very much in advance for your help.

    Borge

  • Ke Sun

    Thank you, when I run two instances I obtain the same result as you. The diffrence essensial when I ran 5 instances of the same application.
  • docturny

    I have tested out all the functions GetProcessTimes(), GetThreadTimes() and QueryPerformanceFrequency().  The results I obtain with all the functions are extremely inaccurate and far closer to elapsed (wall-clock) time than used CPU time.  Other processes running all heavily influence the results obtained with all 3 functions.

     

    To me it seems that the more CPU bound the process you try to measure the CPU time for, the more inaccurate and closer to wall-clock time are the results you get.  My favorite test program is my brute force sudoku solver that when running alone finds 1,004,634 solutions to a given test sudoku in approx. 16 seconds of measured CPU time.  If I run 2 processes of this program simultaneously, the CPU time measured for each processes is some 26 seconds, which equals elapsed time!  By the way, also the CPU time stated for each process in the Task Manager is for my test program equal to elapsed (wall-clock) time and not used CPU time.


  • EwenTweedie

    To avoid influence of other Processes(even a debugger), use

    BOOL GetThreadTimes(
    HANDLE hThread,
    LPFILETIME lpCreationTime,
    LPFILETIME lpExitTime,
    LPFILETIME lpKernelTime,
    LPFILETIME lpUserTime
    );

    BOOL GetProcessTimes(
    HANDLE hProcess,
    LPFILETIME lpCreationTime,
    LPFILETIME lpExitTime,
    LPFILETIME lpKernelTime,
    LPFILETIME lpUserTime
    );

    It works only in NT-platrorm. But it helpful.


  • Mike Thurber

    __int64 creBeg, extBeg, krnBeg, usrBeg;

    __int64 creEnd, extEnd, krnEnd, usrEnd;

    ::GetThreadTimes(::GetCurrentThread(), (FILETIME*)&creBeg, (FILETIME*)&extBeg, (FILETIME*)&krnBeg, (FILETIME*)&usrBeg);

    DWORD beg = GetTickCount(), end;

    for(int i=0;i<1000000000;i++);

    end = GetTickCount();

    ::GetThreadTimes(::GetCurrentThread(), (FILETIME*)&creEnd, (FILETIME*)&extEnd, (FILETIME*)&krnEnd, (FILETIME*)&usrEnd);

    __int64 krn = krnEnd - krnBeg;

    __int64 usr = usrEnd - usrBeg;

    krn /= 10000;

    usr /= 10000;

    end-=beg;

    cout<<"ThreadTimes: "<<(DWORD)usr+krn<<"ms"<<endl;

    cout<<"Ticks: "<<(DWORD)end<<"ms"<<endl;

    cin.get();

    If we run once we obtain:
    ThreadTimes: 4937ms
    Ticks: 5000ms

    If we run severel instances simultaneously:
    ThreadTimes: 4968ms
    Ticks: 23437ms

    Here, you see, that influence of other process is negligible for GetThreadTimes function. Precision of this function not high, but sevrel exicution of the same code will helps.


  • Measuring run time of a process or function