Change the row length of 1bit color depth pixmap to the actual stride.
The previous implementation use the width as its stride which is not
good. As it will waste 8 times of space and also bring some non-unify
code path. With this commit, we can merge those 1bit or other color
depth to almost one code path. And we will use pixel buffer object
as much as possible due to performance issue. By default, some mesa
hardware driver will fallback to software rasterization when use
glReadPixels on a non-buffer-object frame buffer. This change will
get about 4x times performance improvemention when we use y-inverted
glamor or the driver support hardware y-flipped blitting.