Hi everyone,
Our company is currently converting from a mainframe environment to PCs. As a pilot we took an application that processes about 5 million rows with a record length of about 7000 bytes.
The stats were amazing:
C#: 3 minutes 17 seconds
C++ (native, not .NET): 1 minute 28 seconds
and believe it or not, the COBOL program took only 45 seconds.
The program does a simple read of a record, applies a few business rules, replaces certain bytes in the string and writes the result to a new file.
I expected C++ to perform better.
Does anyone have any idea why the COBOL program performed better than the C++ program? The data files were the same, the logic was the same and the machine was the same, but the performance was drastically different.
Any help or tips will be greatly appreciated.
The C code is below:
FILE * FileToRead;
FILE * FileToWrite;
FileToRead = fopen("c:\\input\\BAC100M.TXT","rt");
FileToWrite = fopen("c:\\Data\\output\\BACOUT.TXT","wt");
char LineData[7002];
// Loop on the result of fgets() instead of feof(), so a failed read is never processed.
while (fgets(LineData, 7001, FileToRead) != NULL)
{
    // A char array cannot be assigned to, so the processed line goes into a std::string.
    std::string Line(LineData);
    Line = ProcessLine(Line).substr(0, 7001);
    fputs(Line.c_str(), FileToWrite);
}
fclose(FileToRead);
fclose(FileToWrite);

Performance comparison between C++, C# and COBOL
AnaC
Thanks, Reeves. I definitely think the string copies may be the problem, and I have a practical solution for it, but I'd like to benchmark this first. Could you send me your file, or just one row of it? You can email it to me.
Brian
MW1239
"What I heard from my COBOL guys is that they can read a row directly into a data structure and don't have to substring to get at a specific field. Is there anything in C++ that lets me read a row of data into a data structure without doing any substrings?"
What does a row look like? Comma/tab delimited, fixed column width? There are always certain optimizations to be made. You could even just chop a row by placing null terminators, and have a structure of char* that points into it.
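For a fixed-width file, one way to get close to the COBOL approach is to describe the row as a struct of char arrays and read straight into it. The sketch below is only an illustration: the offsets come from the substr() calls in the posted ProcessLine(), while the 7000-byte record length, the struct and field names, and the choice of rule are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical record layout. Offsets are taken from the substr() calls in
// the posted ProcessLine(); the 7000-byte length and the filler members are
// assumptions made for this sketch.
struct Record {
    char filler1[1350];
    char flag[2];               // offset 1350, rewritten by the rules
    char filler2[1894];
    char cntr_index[10];        // offset 3246
    char filler3[126];
    char pend_cntr_type;        // offset 3382
    char pend_cntr_busunt[3];   // offset 3383
    char filler4[3];
    char pend_cntr_index[10];   // offset 3389
    char filler5[56];
    char drvd_cash_indicator;   // offset 3455
    char drvd_cash_type[2];     // offset 3456
    char filler6[3542];
};

// Fail the build if the compiler inserted padding or an offset is wrong.
static_assert(sizeof(Record) == 7000, "unexpected padding in Record");
static_assert(offsetof(Record, flag) == 1350, "flag offset mismatch");
static_assert(offsetof(Record, pend_cntr_type) == 3382, "type offset mismatch");
static_assert(offsetof(Record, drvd_cash_type) == 3456, "cash type offset mismatch");

// Compare a fixed-width field with a literal of the same width, no copies.
template <std::size_t N>
bool fieldIs(const char (&field)[N], const char* literal) {
    assert(std::strlen(literal) == N);
    return std::memcmp(field, literal, N) == 0;
}

// One of the posted rules (type "R", indicator "2", cash type "CA" gives "B"),
// written against the struct. The flag byte is changed in place.
bool applyRuleR_CA(Record& rec) {
    if (rec.pend_cntr_type == 'R' && rec.drvd_cash_indicator == '2' &&
        fieldIs(rec.drvd_cash_type, "CA")) {
        rec.flag[0] = 'B';
        return true;
    }
    return false;
}
```

A row could then be filled with something like fread(&rec, sizeof rec, 1, file). Any line terminator in the file is not part of this struct and would have to be read separately.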
In general, there's really not a whole lot of "magic" that can be done lower than C coding, which is basically what you're doing. Go any lower, you have hand coded assembly, and I don't see a big win that way either.
If you can post reading/processing/writing code (with a text file), I can see if I can spot a bottleneck for you. You can post it to my email address if you want, but just be sure that it's complete code (filling in blanks is counterproductive).
Brian
AlucardHellSing
From my subjective testing scenarios, Visual C++ 2005 is slower than Visual C++ 2003. IMHO, with all the problems (bugs) baked into 2005 RTM, there was not nearly enough time to test performance metrics.
My suggestion: Retest using Visual Studio 2003 (NOT Standard Edition as it has a horribly neutered optimization). Compare your test results to VS 2005.
drew_p
It's not clear what the second poster is getting at. There's only one "Visual Studio 2003" compiler, and only one Service Pack for that compiler. I would not expect the quality of the generated code to be significantly different between 2003 (VC 7.1) and 2005 (VC 8.0), but there's always that chance since they do generate different code.
Make doubly sure you're using optimized code. Don't bother benchmarking code from a Debug configuration.
Note that the Microsoft implementation of the Standard C++ library can be slower due to security checks. Also, I would check whether you're doing more string manipulation in your C++ version than in the COBOL version. The substr() you have creates a new string; maybe it doesn't in the COBOL version.
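To illustrate the substr() point: std::string::compare(pos, len, s) checks a slice of the string against a literal without building a temporary string. The helper names below are made up for this sketch; the offset 3456 and the value "CA" come from the posted code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the thread): true when the len characters
// of row starting at pos equal literal. No temporary string is created.
bool fieldEquals(const std::string& row, std::size_t pos, std::size_t len,
                 const char* literal) {
    assert(pos + len <= row.size());
    return row.compare(pos, len, literal) == 0;
}

// Same test as DataString.substr(3456, 2) == "CA" in the posted code.
bool isCashTypeCA(const std::string& row) {
    return fieldEquals(row, 3456, 2, "CA");
}
```

Whether this alone closes the gap to the COBOL version would have to be measured on the real data.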
If you have Visual Studio Team System, you can use the profiler to see where your performance bottleneck is. Ideally, you want the bottleneck to be in the I/O part (80% or more of the time, I would guess based on what you're doing.)
Brian
boran_blok_edan
The file is a fixed-length file.
The code is below. Thanks again for your help.
// For stream processing
#include <iostream>
#include <fstream>
// For simple IO
#include <string>
#include <stdio.h>
#include <time.h>
using namespace std;

std::string ProcessLine(std::string& DataString);
//void ProcessLine(std::string& DataString);
void UseStreams();
/*void Simple();*/

int main()
{
    UseStreams();
    return 0;
}

void UseStreams()
{
    ifstream FileToRead("c:\\Data\\input\\BACM.TXT");
    ofstream FileToWrite("c:\\Data\\output\\BACOUT.TXT");
    char LineData[7002];
    time_t seconds;
    seconds = time(NULL);
    string CurString;
    FileToRead.read(LineData, 7001);
    while (FileToRead)
    {
        // read() does not null-terminate, so build the string from an explicit length.
        CurString.assign(LineData, 7001);
        //FileToWrite << ProcessLine((std::string)LineData).substr(0,7001);
        FileToWrite << ProcessLine(CurString).substr(0, 7001);
        FileToRead.read(LineData, 7001);
    }
    printf("completed in %ld seconds", (long)(time(NULL) - seconds));
    FileToRead.close();
    FileToWrite.close();
}
/*
void Simple()
{
FILE * FileToRead;
FILE * FileToWrite;
FileToRead= fopen("c:\\MBNADATA\\Data\\input\\BAC100M.TXT","rt");
FileToWrite = fopen("c:\\MBNADATA\\Data\\output\\BACOUT.TXT","wt");
char LineData[7002];
time_t seconds;
seconds = time(NULL);
fgets(LineData,7001,FileToRead);
while (!feof(FileToRead))
{
fputs(LineData,FileToWrite);
fgets(LineData,7001,FileToRead);
}
printf( "completed in %1d in Seconds", (time(NULL) - seconds));
fclose(FileToRead);
fclose(FileToWrite);
}
*/
std::string ProcessLine(std::string& DataString)
//void ProcessLine(std::string& DataString)
{
    int i;
    for (i = 1; i <= 78; i++)
    {
        std::string PEND_CNTR_TYPE_A = DataString.substr(3382, 1);
        std::string PEND_CNTR_BUSUNT_A = DataString.substr(3383, 3);
        std::string DRVD_CASH_INDICATOR = DataString.substr(3455, 1);
        std::string DRVD_CASH_TYPE = DataString.substr(3456, 2);
        std::string PEND_CNTR_INDEX_A = DataString.substr(3389, 10);
        std::string CNTR_INDEX_A = DataString.substr(3246, 10);
        if (i == 1)
        {
            //cout << DataString.length();
            //cout << "PEND_CNTR_TYPE_A" + PEND_CNTR_TYPE_A + "\n";
        }
        //Rule LEG50
        if (PEND_CNTR_TYPE_A == "F" &&
            (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
             PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
             PEND_CNTR_BUSUNT_A == "021") &&
            DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "CA" &&
            PEND_CNTR_INDEX_A == "FIXED ")
        {
            DataString = DataString.substr(0, 1350) + "A" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 DRVD_CASH_TYPE == "CA" && PEND_CNTR_INDEX_A == "FIXED ")
        {
            DataString = DataString.substr(0, 1350) + "A" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "CA" &&
                 CNTR_INDEX_A == "FIXED " &&
                 PEND_CNTR_BUSUNT_A != "009" && PEND_CNTR_BUSUNT_A != "011" &&
                 PEND_CNTR_BUSUNT_A != "012" && PEND_CNTR_BUSUNT_A != "021") //3
        {
            DataString = DataString.substr(0, 1350) + "B" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "CA") //4
        {
            // The posted code had "==" here, which compares and discards the result.
            DataString = DataString.substr(0, 1350) + "B" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 DRVD_CASH_TYPE == "CA") //5
        {
            DataString = DataString.substr(0, 1350) + "B" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "CA") //6
        {
            DataString = DataString.substr(0, 1350) + "B" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "CK" || DRVD_CASH_TYPE == "AD") &&
                 PEND_CNTR_INDEX_A == "FIXED ") //7
        {
            DataString = DataString.substr(0, 1350) + "C" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "CK" || DRVD_CASH_TYPE == "AD") &&
                 PEND_CNTR_INDEX_A == "FIXED ") //8
        {
            DataString = DataString.substr(0, 1350) + "C" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "CK" || DRVD_CASH_TYPE == "AD") &&
                 CNTR_INDEX_A == "FIXED " &&
                 PEND_CNTR_BUSUNT_A != "009" && PEND_CNTR_BUSUNT_A != "011" &&
                 PEND_CNTR_BUSUNT_A != "012" && PEND_CNTR_BUSUNT_A != "021") //9
        {
            DataString = DataString.substr(0, 1350) + "C" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "CK" || DRVD_CASH_TYPE == "AD")) //10
        {
            DataString = DataString.substr(0, 1350) + "D" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "CK" || DRVD_CASH_TYPE == "AD")) //11
        {
            DataString = DataString.substr(0, 1350) + "D" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "CK" || DRVD_CASH_TYPE == "AD")) //12
        {
            DataString = DataString.substr(0, 1350) + "D" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "BK" &&
                 PEND_CNTR_INDEX_A == "FIXED ") //13
        {
            DataString = DataString.substr(0, 1350) + "E" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 DRVD_CASH_TYPE == "BK" && PEND_CNTR_INDEX_A == "FIXED ") //14
        {
            DataString = DataString.substr(0, 1350) + "E" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "BK" &&
                 CNTR_INDEX_A == "FIXED " &&
                 PEND_CNTR_BUSUNT_A != "009" && PEND_CNTR_BUSUNT_A != "011" &&
                 PEND_CNTR_BUSUNT_A != "012" && PEND_CNTR_BUSUNT_A != "021") //15
        {
            DataString = DataString.substr(0, 1350) + "E" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "BK") //16
        {
            DataString = DataString.substr(0, 1350) + "F" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 DRVD_CASH_TYPE == "BK") //17
        {
            DataString = DataString.substr(0, 1350) + "F" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "BK") //18
        {
            DataString = DataString.substr(0, 1350) + "F" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "CR" &&
                 PEND_CNTR_INDEX_A == "FIXED ") //19
        {
            DataString = DataString.substr(0, 1350) + "G" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 DRVD_CASH_TYPE == "CR" && PEND_CNTR_INDEX_A == "FIXED ") //20
        {
            DataString = DataString.substr(0, 1350) + "G" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "CR" &&
                 CNTR_INDEX_A == "FIXED " &&
                 PEND_CNTR_BUSUNT_A != "009" && PEND_CNTR_BUSUNT_A != "011" &&
                 PEND_CNTR_BUSUNT_A != "012" && PEND_CNTR_BUSUNT_A != "021") //21
        {
            DataString = DataString.substr(0, 1350) + "G" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "CR") //22
        {
            DataString = DataString.substr(0, 1350) + "H" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 DRVD_CASH_TYPE == "CR") //23
        {
            DataString = DataString.substr(0, 1350) + "H" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "CR") //24
        {
            DataString = DataString.substr(0, 1350) + "H" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "BC" || DRVD_CASH_TYPE == "BD") &&
                 PEND_CNTR_INDEX_A == "FIXED ") //25
        {
            DataString = DataString.substr(0, 1350) + "I" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "BC" || DRVD_CASH_TYPE == "BD") &&
                 PEND_CNTR_INDEX_A == "FIXED ") //26
        {
            DataString = DataString.substr(0, 1350) + "I" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "BC" || DRVD_CASH_TYPE == "BD") &&
                 CNTR_INDEX_A == "FIXED " &&
                 PEND_CNTR_BUSUNT_A != "009" && PEND_CNTR_BUSUNT_A != "011" &&
                 PEND_CNTR_BUSUNT_A != "012" && PEND_CNTR_BUSUNT_A != "021") //27
        {
            DataString = DataString.substr(0, 1350) + "I" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "BC" || DRVD_CASH_TYPE == "BD")) //28
        {
            DataString = DataString.substr(0, 1350) + "J" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "BC" || DRVD_CASH_TYPE == "BD")) //29
        {
            DataString = DataString.substr(0, 1350) + "J" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" &&
                 (DRVD_CASH_TYPE == "BC" || DRVD_CASH_TYPE == "BD")) //30
        {
            DataString = DataString.substr(0, 1350) + "J" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "RE" &&
                 PEND_CNTR_INDEX_A == "FIXED ") //31
        {
            DataString = DataString.substr(0, 1350) + "K" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 DRVD_CASH_TYPE == "RE" && PEND_CNTR_INDEX_A == "FIXED ") //32
        {
            DataString = DataString.substr(0, 1350) + "K" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "RE" &&
                 CNTR_INDEX_A == "FIXED " &&
                 PEND_CNTR_BUSUNT_A != "009" && PEND_CNTR_BUSUNT_A != "011" &&
                 PEND_CNTR_BUSUNT_A != "012" && PEND_CNTR_BUSUNT_A != "021") //33
        {
            DataString = DataString.substr(0, 1350) + "K" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "F" &&
                 (PEND_CNTR_BUSUNT_A == "000" || PEND_CNTR_BUSUNT_A == "009" ||
                  PEND_CNTR_BUSUNT_A == "011" || PEND_CNTR_BUSUNT_A == "012" ||
                  PEND_CNTR_BUSUNT_A == "021") &&
                 DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "RE") //34
        {
            DataString = DataString.substr(0, 1350) + "L" + DataString.substr(1350 + 1);
        }
        else if (PEND_CNTR_TYPE_A == "R" && DRVD_CASH_INDICATOR == "2" &&
                 DRVD_CASH_TYPE == "RE") //35
        {
            DataString = DataString.substr(0, 1350) + "L" + DataString.substr(1350 + 1);
        }
        else if (DRVD_CASH_INDICATOR == "2" && DRVD_CASH_TYPE == "RE") //36
        {
            DataString = DataString.substr(0, 1350) + "L" + DataString.substr(1350 + 1);
        }
        else
        {
            DataString = DataString.substr(0, 1350) + "ZZ" + DataString.substr(1350 + 2);
            //cout << "in ZZ \n";
        }
    }
    return DataString;
}
ricky rich
lachlanj
I worked with Reeves offline. A dramatic speed improvement resulted from the following:
1. we don’t use substr to extract fields, but rather use a new class to simply keep track of the offset and length into the original string.
2. we don’t use string operations to make character substitutions—we simply just make the replacement directly into the string.
Moral of the story: watch out for the underlying cost of using substr() and + on strings, especially on long ones.
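The class that was actually used is not shown in the thread, so the following is only a guess at its shape. FieldRef, applyRule6 and the constant names are invented for this sketch; the offsets and rule 6 come from the posted ProcessLine().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the class described above: it stores only an
// offset and a length and reads the characters from the row on demand.
class FieldRef {
public:
    FieldRef(std::size_t offset, std::size_t length)
        : offset_(offset), length_(length) {}

    // Compare the field inside row with a literal without copying it.
    bool equals(const std::string& row, const char* literal) const {
        assert(offset_ + length_ <= row.size());
        return std::strlen(literal) == length_ &&
               std::memcmp(row.data() + offset_, literal, length_) == 0;
    }

private:
    std::size_t offset_;
    std::size_t length_;
};

// Offsets taken from the posted ProcessLine().
const FieldRef kCashIndicator(3455, 1);
const FieldRef kCashType(3456, 2);
const std::size_t kFlagOffset = 1350;

// Rule 6 from the posted code (indicator "2", cash type "CA" gives "B"),
// with the flag written straight into the row instead of rebuilding it.
void applyRule6(std::string& row) {
    if (kCashIndicator.equals(row, "2") && kCashType.equals(row, "CA")) {
        row[kFlagOffset] = 'B';
    }
}
```

The posted version of this rule builds two substrings of a 7000-byte row and concatenates them to change one character. Here the same change is a single byte write.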
Brian
tork1
Yes, there is only one Visual Studio C++ compiler, but if you've made the mistake of purchasing Visual C++.NET Standard Edition 2003, as I did years ago, then guess what... your C++ optimizing compiler does not really do optimization. While it's a shame to spend $200 for a C++ compiler that does not optimize your code, in no way do I blame the C++ team for this - I'm sure this was some sort of business decision, where a clever executive decided .NET really needed to be sold, so the C++ "optimizing" compiler would only be included if you bought the complete set of .NET languages (i.e. Visual Studio), and not included with the tool specialized for C++.
Sorry - I got sidetracked. All I was saying is if you are benchmarking Visual C++ 2003 be sure that you don't use Visual C++.NET Standard Edition 2003 as your metrics may be way off. Sorry for not being more clear. (and less rambling)
bkejser
Thanks, Brian. Can you send me your email address, or email me at reeves.edward@tabsdirect.com?
Thanks
AlexBB
If you are running your test case as a single-threaded application on VS2005, you may be seeing the overhead of the multi-threaded library. You don't need synchronization and critical sections in this case, but the multi-threaded library still does everything needed to ensure thread safety.
Take a look at http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=253978&SiteID=1 for details.
One of the posters there recommended using _fread_nolock() instead of fread().
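A rough sketch of what that could look like for fixed-length records follows. _fread_nolock and _fwrite_nolock are Microsoft CRT functions, so the macros fall back to the standard calls on other compilers. The function name, the macro names and the 8192-byte buffer are assumptions for illustration, and any speedup would need to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// _fread_nolock/_fwrite_nolock skip the per-call stream lock in the
// Microsoft CRT. Elsewhere this sketch uses the standard (locking) calls.
#if defined(_MSC_VER)
#define READ_NOLOCK  _fread_nolock
#define WRITE_NOLOCK _fwrite_nolock
#else
#define READ_NOLOCK  std::fread
#define WRITE_NOLOCK std::fwrite
#endif

// Copy whole fixed-length records from in to out and return how many were
// copied. Only safe when no other thread touches these FILE objects.
std::size_t copyRecords(std::FILE* in, std::FILE* out, std::size_t recordLen) {
    char buffer[8192];
    assert(recordLen > 0 && recordLen <= sizeof buffer);
    std::size_t count = 0;
    while (READ_NOLOCK(buffer, recordLen, 1, in) == 1) {
        if (WRITE_NOLOCK(buffer, recordLen, 1, out) != 1) {
            break;  // write error: stop and report what was copied so far
        }
        ++count;
    }
    return count;
}
```

The per-record processing would go between the read and the write. A trailing partial record is not copied.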
Eugene
desilets
I have set optimization to /O2 (speed). Is there anything else that I can set for speed?
Also, I noticed that in C#, using the FileStream object to open a file and write it back is faster than C++ fgets and fputs (54 seconds in C# vs. 89 seconds in C++). Am I doing something wrong?
What I heard from my COBOL guys is that they can read a row directly into a data structure and don't have to substring to get at a specific field.
Is there anything in C++ that lets me read a row of data into a data structure without doing any substrings?