On your first question: whenever you call the .Solve() function, the system is already solved in parallel on whichever backend is in use (OpenMP, CUDA, etc.). So it makes no sense to try to solve the three systems in parallel on top of that; it will not give you any performance benefit.
We have tried this with CUDA 5.5 on Linux, and we will try to test it on Windows. I will keep you posted on that (but this might take a couple of days).
But as I understand it, there is currently no way to have one system solved on the CPU and another on the accelerator IN PARALLEL, as opposed to solving one system on the CPU and THEN another system on the GPU.
The question is “Is Paralution a thread-safe library?”
Good question. We have never tried that, but I think it should work. You need two fully independent problems (two of each: matrix, rhs, x). Then in the first thread you move matrixA, rhsA and xA to the accelerator and solve there, while in the second thread you keep matrixB, rhsB and xB on the host and solve them there. Since the two threads share no data, this should work.
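The two-thread layout described above can be sketched roughly as follows. This is only an illustration of the threading pattern in plain C++ with std::thread, not Paralution code: the Problem struct, make_problem, jacobi_solve and solve_two_systems_concurrently are hypothetical stand-ins, and the small Jacobi loop merely takes the place of the library's .Solve() call. Whether the real library behaves correctly when used this way is, as said, untested.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for one fully independent problem (matrix, rhs, x).
struct Problem {
    std::vector<std::vector<double>> A;
    std::vector<double> rhs;
    std::vector<double> x;
};

// Build a problem with a zero initial guess.
Problem make_problem(std::vector<std::vector<double>> A, std::vector<double> rhs) {
    Problem p;
    p.x.assign(rhs.size(), 0.0);
    p.A = std::move(A);
    p.rhs = std::move(rhs);
    return p;
}

// Stand-in solver (plain Jacobi iteration, assumes a diagonally dominant A).
// In the real setup this is where the library's .Solve() would be called.
void jacobi_solve(Problem& p, int iterations) {
    const std::size_t n = p.rhs.size();
    assert(p.A.size() == n && p.x.size() == n);
    std::vector<double> next(n);
    for (int it = 0; it < iterations; ++it) {
        for (std::size_t i = 0; i < n; ++i) {
            double sum = p.rhs[i];
            for (std::size_t j = 0; j < n; ++j) {
                if (j != i) sum -= p.A[i][j] * p.x[j];
            }
            next[i] = sum / p.A[i][i];
        }
        p.x.swap(next);  // next is fully overwritten in the following sweep
    }
}

// Each thread owns exactly one problem; nothing is shared between them.
void solve_two_systems_concurrently(Problem& a, Problem& b) {
    std::thread thread_a([&a] {
        // Assumption: with the real library, problem A's matrix, rhs and x
        // would be moved to the accelerator here before solving.
        jacobi_solve(a, 200);
    });
    std::thread thread_b([&b] {
        // Problem B stays on the host.
        jacobi_solve(b, 200);
    });
    thread_a.join();
    thread_b.join();
}
```

To try it, build two problems with make_problem, pass them to solve_two_systems_concurrently, and read the results from the x members after the call returns. The important design point is that each thread touches only its own matrix, rhs and x, so no locking is needed in the calling code; any global state inside the library itself is a separate question that only a test can answer.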