By default, Pylearn2 loads the entire dataset into main memory (not GPU memory). This can be a problem for large datasets. Several Python/NumPy solutions exist for dealing with large data.
Of these, Pylearn2 currently supports only PyTables and h5py. (memmap support was introduced in the latest version of Theano, but it has not been tested with Pylearn2 yet.)
pylearn2.datasets.dense_design_matrix.DenseDesignMatrixPyTables is designed to mimic the behaviour of DenseDesignMatrix, but underneath it stores the data in a PyTables HDF5 file. pylearn2.datasets.svhn.SVHN is a good example of how to create a DenseDesignMatrixPyTables object and store your data in it.
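To illustrate the kind of on-disk storage described above, here is a minimal sketch that uses plain PyTables rather than the Pylearn2 class itself. It assumes PyTables 3.x (snake_case API); the file name, array name "X", sizes, and compression settings are hypothetical and chosen only for illustration. For the actual DenseDesignMatrixPyTables workflow, refer to pylearn2.datasets.svhn.SVHN.

```python
import os
import tempfile

import numpy as np
import tables

# Hypothetical sizes and file location, for illustration only.
n_examples, n_features = 1000, 32
chunk = 100
path = os.path.join(tempfile.mkdtemp(), "design_matrix.h5")

rng = np.random.RandomState(0)

# Write the design matrix to disk one chunk at a time, so the full
# array never has to exist in main memory.
with tables.open_file(path, mode="w") as h5file:
    filters = tables.Filters(complib="zlib", complevel=5)
    X = h5file.create_carray(
        h5file.root,
        "X",
        atom=tables.Float32Atom(),
        shape=(n_examples, n_features),
        filters=filters,
    )
    for start in range(0, n_examples, chunk):
        X[start:start + chunk] = rng.rand(chunk, n_features).astype("float32")

# Read back one minibatch; slicing the node returns a NumPy array
# holding only the requested rows.
with tables.open_file(path, mode="r") as h5file:
    batch = h5file.root.X[200:232]

print(batch.shape)
```

Slicing the on-disk array is what keeps memory use bounded: only the rows of the requested minibatch are read from the file.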
If your data is already saved in HDF5 format, you can use the pylearn2.datasets.hdf5.HDF5Dataset class to access it in Pylearn2. For an example of how to save data in HDF5 format and load it with HDF5Dataset, take a look at pylearn2.datasets.tests.test_hdf5.TestHDF5Dataset.
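The following is a rough sketch of that workflow, not a substitute for the test referenced above. The first part uses only h5py and NumPy to write and partially read a file. The file name, the dataset names "X" and "y", and the array sizes are hypothetical. The helper load_with_pylearn2 is also hypothetical, and the keyword arguments it passes to HDF5Dataset (filename, X, y) are an assumption about the constructor that should be checked against your Pylearn2 version; the helper is defined but not called, so the block runs without Pylearn2 installed.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file location and data, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "train.h5")
rng = np.random.RandomState(0)
X = rng.rand(500, 20).astype("float32")
y = rng.randint(0, 10, size=(500, 1))

# Save features and targets as two named datasets in one HDF5 file.
with h5py.File(path, "w") as f:
    f.create_dataset("X", data=X)
    f.create_dataset("y", data=y)

# Reopen the file and read only the first ten rows from disk.
with h5py.File(path, "r") as f:
    stored_shape = f["X"].shape
    first_rows = f["X"][:10]


def load_with_pylearn2(filename):
    """Hypothetical helper: open the file through Pylearn2.

    Assumes HDF5Dataset accepts the file name plus the names of the
    datasets that hold the features and targets.
    """
    from pylearn2.datasets.hdf5 import HDF5Dataset
    return HDF5Dataset(filename=filename, X="X", y="y")


print(stored_shape, first_rows.shape)
```

With Pylearn2 available, load_with_pylearn2(path) would be the step that hands the file to the library.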
Each library provides its own comparison with the other.
One advantage of h5py over PyTables is that it can use HDF5 files made with other libraries, whereas HDF5 files written by PyTables follow PyTables-specific conventions. PyTables, on the other hand, claims to be faster and supports compression.