Thursday, July 30, 2009

How can I load info into a 17770x480000 element array?

I need to put that much info somewhere I can access it really fast. I first tried making thousands of files, but...

Reading the text files takes about 1 second per prediction, which is too slow. To finish in a reasonable time, the predictions have to be made at about 50 per second.

It would be really useful if I could load all the info into one giant matrix. For some reason, C++ (or rather Visual C++, or my PC) doesn't let me do this: I get an error at runtime. Can I fix this?

I could theoretically simulate a matrix on my hard drive, but for that to work, the data on the hard drive must be accessible by numeric address using pointers. Is this possible?

(I would need about 3.2 gigs of memory one way or another.)

I've heard of people making giant bit streams, where you can read any bit by giving its address. Does this use RAM, the hard drive, or both? How would I do this?
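
A rough sketch of the "giant bit stream" idea, for illustration only: fixed-width values are packed back to back into 64-bit words, and an element index is turned into a word number plus a bit offset. The class name PackedArray and its interface are made up for this example. As written it keeps everything in RAM (a std::vector), but the same index arithmetic would work over a file-backed buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: stores `count` values of `bits` bits each,
// packed contiguously into 64-bit words.
class PackedArray {
public:
    PackedArray(std::uint64_t count, unsigned bits)
        : bits_(bits), words_((count * bits + 63) / 64, 0) {
        assert(bits >= 1 && bits <= 32);
    }

    void set(std::uint64_t index, std::uint32_t value) {
        const std::uint64_t mask = (std::uint64_t{1} << bits_) - 1;
        const std::uint64_t v = value & mask;
        const std::uint64_t bit = index * bits_;   // absolute bit address
        const std::uint64_t word = bit / 64;
        const unsigned shift = static_cast<unsigned>(bit % 64);
        words_[word] = (words_[word] & ~(mask << shift)) | (v << shift);
        if (shift + bits_ > 64) {
            // The value straddles two words: the low `spill` bits went into
            // the first word, the remaining high bits go into the next one.
            const unsigned spill = 64 - shift;
            words_[word + 1] = (words_[word + 1] & ~(mask >> spill)) | (v >> spill);
        }
    }

    std::uint32_t get(std::uint64_t index) const {
        const std::uint64_t mask = (std::uint64_t{1} << bits_) - 1;
        const std::uint64_t bit = index * bits_;
        const std::uint64_t word = bit / 64;
        const unsigned shift = static_cast<unsigned>(bit % 64);
        std::uint64_t v = words_[word] >> shift;
        if (shift + bits_ > 64) {
            v |= words_[word + 1] << (64 - shift);
        }
        return static_cast<std::uint32_t>(v & mask);
    }

    std::size_t size_bytes() const { return words_.size() * sizeof(std::uint64_t); }

private:
    unsigned bits_;
    std::vector<std::uint64_t> words_;
};
```

For a matrix, the element index would be row * columns + column. Whether packing is worth it depends on how many distinct values an element can take, which the question does not say.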

Reply:You don't say what the platform is, but a 64-bit box with sufficient RAM should be able to do this. (You would probably need a 64-bit OS to have sufficient per-process address space.)

Does Windows have an equivalent of the POSIX mmap call? That allows you to map a filesystem object into the process's address space and would let you work with a smaller amount of RAM (but you would still need the VM space, so probably still a 64-bit platform).
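
Windows does offer a counterpart: CreateFileMapping together with MapViewOfFile. The sketch below uses the POSIX mmap call named above, simply to show the shape of the idea. It assumes the matrix is stored in a file as one byte per element in row-major order, which is an assumption for the example and not something the question specifies. The class name MappedMatrix is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical read-only view of a row-major byte matrix stored in a file.
// The OS pages data in on demand, so only the touched parts occupy RAM,
// but the whole file still needs address space (hence a 64-bit process).
class MappedMatrix {
public:
    MappedMatrix(const char* path, std::uint64_t rows, std::uint64_t cols)
        : rows_(rows), cols_(cols) {
        const int fd = open(path, O_RDONLY);
        if (fd < 0) return;
        struct stat st;
        if (fstat(fd, &st) == 0 &&
            static_cast<std::uint64_t>(st.st_size) >= rows * cols) {
            size_ = static_cast<std::size_t>(st.st_size);
            void* p = mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) data_ = static_cast<const std::uint8_t*>(p);
        }
        close(fd);  // the mapping remains valid after the descriptor is closed
    }

    ~MappedMatrix() {
        if (data_) munmap(const_cast<std::uint8_t*>(data_), size_);
    }

    MappedMatrix(const MappedMatrix&) = delete;
    MappedMatrix& operator=(const MappedMatrix&) = delete;

    bool ok() const { return data_ != nullptr; }

    // Plain pointer arithmetic, exactly as with an in-memory array.
    std::uint8_t at(std::uint64_t row, std::uint64_t col) const {
        assert(ok() && row < rows_ && col < cols_);
        return data_[row * cols_ + col];
    }

private:
    std::uint64_t rows_;
    std::uint64_t cols_;
    std::size_t size_ = 0;
    const std::uint8_t* data_ = nullptr;
};
```

Construction fails quietly (ok() returns false) if the file is missing, too small, or cannot be mapped, so callers should check ok() before calling at().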

How good is your data locality? Can you partition the problem into multiple smaller datasets that, even if each one runs as its own process, would ease your VM space constraints?

Is the matrix sparse? Can you take advantage of this?
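
If it is sparse, one possible layout (a sketch, not the only option) is to keep, for each row, just the non-empty entries as (column, value) pairs sorted by column, and to look an element up with a binary search. The class name SparseRows and the choice of 0 as the "empty" value are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sparse matrix: each row holds only its stored entries,
// sorted by column. Missing entries read back as 0.
class SparseRows {
public:
    using Entry = std::pair<std::uint32_t, std::uint8_t>;  // (column, value)

    explicit SparseRows(std::size_t rows) : rows_(rows) {}

    // Entries of a row must be appended in strictly increasing column order.
    void append(std::size_t row, std::uint32_t col, std::uint8_t value) {
        assert(row < rows_.size());
        assert(rows_[row].empty() || rows_[row].back().first < col);
        rows_[row].emplace_back(col, value);
    }

    std::uint8_t get(std::size_t row, std::uint32_t col) const {
        assert(row < rows_.size());
        const std::vector<Entry>& r = rows_[row];
        // Binary search for the first entry whose column is >= col.
        const auto it = std::lower_bound(
            r.begin(), r.end(), col,
            [](const Entry& e, std::uint32_t c) { return e.first < c; });
        return (it != r.end() && it->first == col) ? it->second : 0;
    }

    std::size_t stored_entries() const {
        std::size_t n = 0;
        for (const auto& r : rows_) n += r.size();
        return n;
    }

private:
    std::vector<std::vector<Entry>> rows_;
};
```

Memory use then scales with the number of stored entries instead of rows times columns, at the cost of a logarithmic lookup per element.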

Just some thoughts.
Reply:Are you really sure you need that much space? I've been working in CompSci for 25 years now, and I've never had the need for a matrix that big...

I don't think you'll be able to fix this problem using in-memory matrices: even if you stored just a single byte per matrix element, it would require about 8.5 GB of RAM, and few computers have that.
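
To make the arithmetic explicit: 17770 x 480000 is 8,529,600,000 elements, so one byte per element comes to roughly 8.5 GB. The small helper below (its name is made up for this example) does the same calculation for any element width. Note that the 3.2 GB figure in the question matches 3 bits per element, although the question does not state the element width, so that is only a guess.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for rows x cols elements of `bits_per_element` bits each,
// rounded up to a whole byte. 64-bit arithmetic matters here: the element
// count alone (about 8.5 billion) does not fit in a 32-bit integer.
inline std::uint64_t matrix_bytes(std::uint64_t rows, std::uint64_t cols,
                                  std::uint64_t bits_per_element) {
    assert(bits_per_element > 0);
    return (rows * cols * bits_per_element + 7) / 8;
}

// matrix_bytes(17770, 480000, 8) gives 8,529,600,000 bytes (one byte each).
// matrix_bytes(17770, 480000, 3) gives 3,198,600,000 bytes (three bits each).
```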

So you'll have to store it in a file, and even that is iffy because some filesystems and older file APIs don't support files larger than 2 GB. But you might be able to split it up into several files.
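
One way the split could work, sketched under the assumption of one byte per element in row-major order: put a fixed number of whole rows in each file, then compute which file and which byte offset hold a given element. The names ChunkLocation and locate are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Where an element lives when the matrix is split into files of
// `rows_per_file` whole rows each (hypothetical layout).
struct ChunkLocation {
    std::uint64_t file_index;  // which file, counting from 0
    std::uint64_t offset;      // byte offset inside that file
};

inline ChunkLocation locate(std::uint64_t row, std::uint64_t col,
                            std::uint64_t cols, std::uint64_t rows_per_file) {
    assert(col < cols && rows_per_file > 0);
    return ChunkLocation{row / rows_per_file,
                         (row % rows_per_file) * cols + col};
}
```

With 480000 columns, 4000 rows per file keeps each file at 1.92 GB, under the 2 GB mark, and the 17770 rows then need 5 files.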

Contact me at jacovkss2 at-sign yahoo.com for more info.

