
LU-SGS cost can be twice what one would expect #2837

Description

@pcarruscag

For a 3D mesh with many tets and prisms (and thus a dual mesh where each point has many neighbors), LU-SGS costs about as much as 2 matrix-vector products.

Image

LU-SGS, as the name implies, does one pass over the lower entries of the matrix and then one over the upper entries.
So one would naively expect it to cost about the same as one product.
I think the issue is that, since each point has many neighbors, the hardware prefetcher wrongly concludes that the entire matrix is being used, so the CPU reads the whole thing from memory on each pass instead of only the half that pass needs.
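To make the two passes concrete, here is a minimal sketch of LU-SGS on a scalar CSR matrix. It is an illustration only: the function name `lu_sgs` and the counters `used` and `scanned` are hypothetical, the real matrix is assumed to store blocks rather than scalars, and Python has no notion of a prefetcher. The counters only show that each pass walks the index range of every row while using about half of the off-diagonal entries.

```python
def lu_sgs(row_ptr, col_idx, vals, b):
    """One LU-SGS application: solve (D + L) D^-1 (D + U) x = b.

    Returns x, the number of off-diagonal entries actually used, and the
    number of stored entries walked over by the two sweeps.
    """
    n = len(b)
    x = [0.0] * n
    diag = [0.0] * n
    used = 0
    scanned = 0

    # Forward sweep: x*_i = (b_i - sum_{j<i} a_ij x*_j) / a_ii
    for i in range(n):
        acc = b[i]
        for k in range(row_ptr[i], row_ptr[i + 1]):
            scanned += 1
            j = col_idx[k]
            if j < i:
                acc -= vals[k] * x[j]
                used += 1
            elif j == i:
                diag[i] = vals[k]
        x[i] = acc / diag[i]

    # Backward sweep: x_i = x*_i - (sum_{j>i} a_ij x_j) / a_ii
    for i in reversed(range(n)):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            scanned += 1
            j = col_idx[k]
            if j > i:
                acc += vals[k] * x[j]
                used += 1
        x[i] -= acc / diag[i]

    return x, used, scanned
```

For a tridiagonal 3x3 matrix with 7 stored entries, the two sweeps together use the 4 off-diagonal entries once each, but the loops step over all 7 entries twice. That is the gap between the expected cost (about 1 product) and a cost closer to 2 products if the skipped entries are still fetched from memory.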

The ideal solution would probably be to switch to the LDU format (used by OpenFOAM). Before that, I might try to confuse the prefetcher so it doesn't read the entire matrix, to see if that can get close to the "speed of light" for this (the cost of 1 product).
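As a rough sketch of why a split storage would help, the code below separates a scalar CSR matrix into diagonal, strictly lower, and strictly upper parts, each stored contiguously, so that each sweep streams only the entries it needs. This is an assumption-laden illustration: `split_ldu` and `lu_sgs_ldu` are hypothetical names, the split is row-wise, and it is not meant to reproduce OpenFOAM's actual LDU addressing.

```python
def split_ldu(row_ptr, col_idx, vals):
    """Split a CSR matrix into diag, strictly lower CSR, strictly upper CSR."""
    n = len(row_ptr) - 1
    diag = [0.0] * n
    low_ptr, low_idx, low_val = [0], [], []
    up_ptr, up_idx, up_val = [0], [], []
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j < i:
                low_idx.append(j)
                low_val.append(vals[k])
            elif j > i:
                up_idx.append(j)
                up_val.append(vals[k])
            else:
                diag[i] = vals[k]
        low_ptr.append(len(low_idx))
        up_ptr.append(len(up_idx))
    return diag, (low_ptr, low_idx, low_val), (up_ptr, up_idx, up_val)


def lu_sgs_ldu(diag, lower, upper, b):
    """LU-SGS where each sweep reads only its own half of the matrix."""
    low_ptr, low_idx, low_val = lower
    up_ptr, up_idx, up_val = upper
    n = len(b)
    x = [0.0] * n

    # Forward sweep reads only the lower arrays.
    for i in range(n):
        acc = b[i]
        for k in range(low_ptr[i], low_ptr[i + 1]):
            acc -= low_val[k] * x[low_idx[k]]
        x[i] = acc / diag[i]

    # Backward sweep reads only the upper arrays.
    for i in reversed(range(n)):
        acc = 0.0
        for k in range(up_ptr[i], up_ptr[i + 1]):
            acc += up_val[k] * x[up_idx[k]]
        x[i] -= acc / diag[i]

    return x
```

With this layout there are no entries to skip inside a row, so sequential prefetching works in favor of each sweep rather than against it. The trade-off is that a plain matrix-vector product now has to combine three arrays per row.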
