Can anyone provide guidance on what would be the fastest method of searching a large byte array for some particular sequence of bytes I know I could convert it to a string and do an IndexOf but I would like to avoid that. Any ideas
You can opt to come out the code of KMP pattern matching algorithm. That will allows you to match almost anything you want. KMP => Knuth Morris Pratt algorithm.
Peter Ritchie comments are spot on, you need to check each byte.
I just knocked up this :
using System; using System.Collections.Generic; using System.Text;
namespace FindByteArrayinByteArray { class Program { static int indexOf(byte[] array, byte[] value) { int found = 0; for (int i = 0; i < array.Length; i++) { if (array[ i ] == value[found]) { if (++found == value.Length) { return i - found + 1; } } else { found = 0; } } return -1; }
For big byte array (20 more items maybe), you can copy it and use quick-sort to sort it then use binary-search...
Even if you were just looking for a single byte, I'd guess that the array would have to be much large (several thousand, I'd say), before sorting becomes faster than a linear search. (actually, as the search would be O(N) and the sort O(N log N), unless you can get several searches out of one sort, it would never be faster)
if you were looking for the bytes 03 02 01 in the following array: 01 01 02 02 03 03 03 02 01 01 01 02 02 02 03 03, if you sorted that array looking for (also sorted) 01 02 03 you'd now not find any matches in: 01 01 01 01 01 02 02 02 02 02 02 03 03 03 03 03
I was wondering if you how to handle searching a byte array within a byte array which my span several blocks. For example : I receive a buffer (byte array) which I need to search for a sequence of bytes (works great with your example) but the problem is that the I can recieve multiple blocks of byte array's and the sequence of bytes which I am searching might span one or more blocks - which makes it difficult to search without somehow concatenating the blocks and then searching - that would incur a performance overhead which I would like to avoid. Any thoughts
Fast searching of a byte array
iscbaltazar
Vikcia
Hello,
Peter Ritchie comments are spot on, you need to check each byte.
I just knocked up this :
using System;
using System.Collections.Generic;
using System.Text;
namespace FindByteArrayinByteArray
{
class Program
{
static int indexOf(byte[] array, byte[] value)
{
int found = 0;
for (int i = 0; i < array.Length; i++)
{
if (array[ i ] == value[found])
{
if (++found == value.Length)
{
return i - found + 1;
}
}
else
{
found = 0;
}
}
return -1;
}
public static void Main()
{
System.Console.WriteLine(indexOf(new byte[] { 1, 2, 3, 4 }, new byte[] { 1, 2, 3 }));
System.Console.WriteLine(indexOf(new byte[] { 1, 2, 3, 4 }, new byte[] { 3, 2, 3 }));
System.Console.WriteLine(indexOf(new byte[] { 1, 2, 3, 4 }, new byte[] { 2, 3, 4 }));
System.Console.WriteLine(indexOf(new byte[] { 1, 2, 3, 4 }, new byte[] { 4 }));
System.Console.ReadKey();
}
}
}
Thanks
Steve
Tony B
Cryo75
Even if you were just looking for a single byte, I'd guess that the array would have to be much large (several thousand, I'd say), before sorting becomes faster than a linear search. (actually, as the search would be O(N) and the sort O(N log N), unless you can get several searches out of one sort, it would never be faster)
Janet666
Adam81
Thanks SteveDrake - you have been a good help...
I was wondering if you how to handle searching a byte array within a byte array which my span several blocks. For example : I receive a buffer (byte array) which I need to search for a sequence of bytes (works great with your example) but the problem is that the I can recieve multiple blocks of byte array's and the sequence of bytes which I am searching might span one or more blocks - which makes it difficult to search without somehow concatenating the blocks and then searching - that would incur a performance overhead which I would like to avoid. Any thoughts
Yu Ro
MitchPetel
For big byte array (20 more items maybe), you can copy it and use quick-sort to sort it then use binary-search...
But it depends..