Hi,
I'm getting an "out of memory" exception when I try to pass a large char array to
the string constructor. The array holds about 200 MB of XML data.
I need to convert it to a string for further handling.
With smaller data everything works fine.
Any ideas how to convert the data to a string?
Thanks
Dob

"Out of Memory" String constructor
mikebk
Check this, you might get some info from here:
http://www.pcreview.co.uk/forums/thread-2539908-1.php
http://www.codeproject.com/dotnet/strings.asp?df=100&forumid=13838&exp=0&select=773966
CL0CKW0RK
tongkusat
As an intermediate solution, we solved the problem on the sender side by sending smaller portions.
Dob
Benedikt
kid_kaneda
If you have a lot of data, you should use the System.Text.StringBuilder class instead of String.
Hope this is useful for you.
Noorul Ahmed
Just throwing my two cents in. You can't use the XML DOM to load the data. The overhead is too high. I used to work with 85MB XML files and loading one easily took from 500MB to over 1GB. MS doesn't actually support loading anything over 4 or 5 MB in the DOM. Tools like XmlSpy, VS or IE won't load XML files that large either (or they'll start to act weird). Honestly, you don't want to load that much in memory. I've never seen a case where you MUST have all this data in memory. In fact, I'm not aware of any application that would ever attempt to load so much data. In all cases applications use memory mapping, segmentation or a database.
Streaming is the best approach. Using a forward/backward stream you can move the window of data in memory as you need it. For searching you start at the current point and scan forward.
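To make the streaming idea concrete, here is a minimal sketch using XmlReader, which is forward-only (so it covers the "scan forward" part, not moving backward). The class name, the method name and the element name passed in are hypothetical; the point is only that one node at a time is held in memory instead of a 200MB string or a DOM.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlStreamingSketch
{
    // Counts elements with the given name without building a DOM
    // and without ever holding the whole document as a string.
    public static int CountElements(TextReader input, string elementName)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(input))
        {
            // Read() advances one node at a time (forward-only).
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

For a file on disk you could pass a StreamReader opened on that file as the input, so the data is pulled from disk as the reader advances.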
However if you really, really need that much in memory then you can alternatively use a memory mapped file. You'll have to create a managed version or use one of the versions floating around but this allows you to rely on Windows to load the data as needed. You'll still end up mucking with pointers though. Of course if you're just loading it as a string then indexing is probably your friend anyway. Also keep in mind that decomposition is good for working with really large files. With decomposition you use streaming or another navigation technique to avoid bringing everything into memory. When you need to do some work (such as update child elements) you load just those elements into memory using the DOM or equivalent. It is slow but it works and doesn't eat up memory.
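The decomposition approach described above could look roughly like the sketch below. It is only an illustration under assumptions: it uses XmlReader together with XElement from System.Xml.Linq (an API newer than this thread), and StreamElements and the element name are hypothetical. The reader walks the document forward, and only the element currently being worked on is materialized in memory.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XmlDecompositionSketch
{
    // Yields each matching element as a small in-memory tree, one at a time.
    public static IEnumerable<XElement> StreamElements(TextReader input, string elementName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                {
                    // ReadFrom loads just this element and leaves the reader
                    // positioned on the node that follows it.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
}
```

Each yielded element can be inspected or changed like a normal small document and then dropped, so memory use stays close to the size of the largest single element rather than the size of the file.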
Finally, as far as string size goes, there is no limit you would hit at this size (beyond the CLR's 2GB cap on a single object and available virtual memory). However, remember that the string must be allocated contiguously, so you need 200MB of contiguous free space in order to allocate the string. This is most likely why you are receiving the error. There simply isn't enough room in the large object heap to store such a large block of memory. You can verify that it isn't the string itself by creating a simple string of length 200MB (actually 400MB as it is Unicode) in a simple console application. It'll work the first few times you try it.
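The console test suggested above could be sketched like this. The class and method names are made up for illustration. PayloadBytes shows the arithmetic (two bytes per char), and TryAllocate attempts a single contiguous string of the requested length and reports whether the allocation succeeded.

```csharp
using System;

public static class LargeStringProbe
{
    // Bytes needed for the character data alone: .NET chars are UTF-16 code units, 2 bytes each.
    public static long PayloadBytes(int charCount)
    {
        return (long)charCount * sizeof(char);
    }

    // Tries to allocate one string of charCount characters.
    // Returns false if the runtime cannot find a large enough contiguous block.
    public static bool TryAllocate(int charCount)
    {
        try
        {
            string s = new string('x', charCount);
            return s.Length == charCount;
        }
        catch (OutOfMemoryException)
        {
            return false;
        }
    }
}
```

For the file size quoted in this thread, PayloadBytes(205478544) gives 410957088 bytes. Calling TryAllocate(205478544) repeatedly from a console application's Main is one way to see whether the failure comes from the string size itself or from the state of the process's address space.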
Michael Taylor - 12/8/06
cplusplus1
Yeah, I appreciate what you're saying, but it does seem from his original comment that he needs the whole thing in memory at once; otherwise he would have parsed bits of it, surely? I know that the XML Document Object Model will be pretty heavyweight, but it was just one suggestion, along with streaming, which I mentioned first up :)
Jassim Rahma
Hi Guys,
thanks for your comments.
Like louthy said, I need the whole thing in memory, because this is not the only way we
receive data; we also get it from a web service and from TCP socket connections.
So we separated the "transportation" from the processing. Originally we didn't have such big
data packages in mind.
The rest of the processing queue relies on having everything complete as a single string,
which shouldn't be a problem with the available memory.
So my question is: who limits the memory for a process, and how can I change the limit?
Dob
Old Jeffrey Zhao
Ron L
Dylan Smith
I did some further testing and found out that there must be a memory limit for a
.NET process. Can I change this limit? I have plenty of system memory left.
Here is my test code:
// Requires: using System.IO; using System.Text;
protected string DoReadFile(string strFileName)
{
    string strContent = "";
    FileInfo fi = new FileInfo(strFileName);
    int size = int.MaxValue;
    if (fi.Length < int.MaxValue)
    {
        size = (int)fi.Length; // size is 205478544
    }
    else
    {
        Log.Warning("Interface/TransportFile/DoReadFile", "Warning file is bigger than max buffer size!", new Param("FileName", strFileName));
    }
    char[] buffer = new char[10000];
    // Create an instance of StreamReader to read from a file.
    using (StreamReader sr = new StreamReader(strFileName, _Encoding))
    {
        // Reserve room for the whole file up front.
        StringBuilder str = new StringBuilder(size + 10000);
        int total = 0;
        int read;
        // Append only the characters actually read, so a short final block
        // does not drag stale buffer contents into the result.
        while ((read = sr.ReadBlock(buffer, 0, buffer.Length)) > 0)
        {
            str.Append(buffer, 0, read);
            total += read;
        }
        // Test how much memory I can get.
        char[] Test1 = new char[total];
        char[] Test2 = new char[total];
        char[] Test3 = new char[total];
        char[] Test4 = new char[total];
        strContent = str.ToString(); // <- usually crashes here
    }
    return strContent;
}
UltimateSniper
Another option is maybe to create a LongString class to replace your string usage:
// Requires: using System.Collections.Generic; using System.Diagnostics;
public class LongString
{
    const int partSize = 1024;
    bool closed = false;
    List<string> items = new List<string>();

    // Every part must be exactly partSize long, except the last one.
    public void Append(string part)
    {
        Debug.Assert(part.Length <= partSize);
        Debug.Assert(!closed);
        if (part.Length < partSize)
        {
            // last piece
            closed = true;
        }
        items.Add(part);
    }

    public char this[long index]
    {
        get
        {
            // Locate the part first, then the character within it.
            int itemIndex = (int)(index / partSize);
            return items[itemIndex][(int)(index % partSize)];
        }
    }
}
Obviously you'd need to add your own methods for Substring, IndexOf, etc. It would give the CLR's memory manager more of a fighting chance with your data, while still offering a single interface to it.
You could build up the string in 1k chunks using the Append method.
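As a rough sketch of that last step, the helper below cuts any text source into fixed-size pieces. ReadChunks and its parameters are hypothetical names. Each returned piece could then be passed to an Append method like the one above, with 1024 as the part size.

```csharp
using System.Collections.Generic;
using System.IO;

public static class ChunkReaderSketch
{
    // Splits the input into strings of partSize characters each.
    public static List<string> ReadChunks(TextReader reader, int partSize)
    {
        var parts = new List<string>();
        char[] buffer = new char[partSize];
        int read;
        // ReadBlock keeps reading until the buffer is full or the input ends,
        // so only the final chunk can be shorter than partSize.
        while ((read = reader.ReadBlock(buffer, 0, partSize)) > 0)
        {
            parts.Add(new string(buffer, 0, read));
        }
        return parts;
    }
}
```

Because no single allocation is larger than one small chunk, nothing here needs a large contiguous block, which is the whole point of splitting the data up.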