
Post by trejkaz on Feb 28, 2022 0:13:42 GMT
So I ran across this apparently well-known algorithm: en.wikipedia.org/wiki/LU_decomposition

Calculating the inverse of a 4x4 matrix, best times for 10,000 runs:
Algorithm in the book: 61.3729 ms
Using LU decomposition: 12.0178 ms
Not too bad! In the grand scheme though, it only shaves a second off a render time of ~40 seconds.
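Roughly, the LU route looks like this: decompose with partial pivoting, then solve for each column of the identity. This is an illustrative Kotlin sketch, not the exact code I timed:

```kotlin
// Sketch: inverting an n x n matrix via LU decomposition with partial
// pivoting. Matrices are plain Array<DoubleArray>; names are illustrative.

fun luInverse(a: Array<DoubleArray>): Array<DoubleArray> {
    val n = a.size
    // Working copy holding both L (below the diagonal, unit diagonal
    // implied) and U (on and above the diagonal).
    val lu = Array(n) { a[it].copyOf() }
    val perm = IntArray(n) { it }

    for (k in 0 until n) {
        // Partial pivoting: pick the row with the largest |pivot|.
        var p = k
        for (i in k + 1 until n) {
            if (kotlin.math.abs(lu[i][k]) > kotlin.math.abs(lu[p][k])) p = i
        }
        if (lu[p][k] == 0.0) error("matrix is singular")
        if (p != k) {
            val tmpRow = lu[k]; lu[k] = lu[p]; lu[p] = tmpRow
            val tmpIdx = perm[k]; perm[k] = perm[p]; perm[p] = tmpIdx
        }
        for (i in k + 1 until n) {
            lu[i][k] /= lu[k][k]                 // multiplier, stored in L's slot
            for (j in k + 1 until n) lu[i][j] -= lu[i][k] * lu[k][j]
        }
    }

    // Solve A * x = e_col for each column of the identity.
    val inv = Array(n) { DoubleArray(n) }
    for (col in 0 until n) {
        val x = DoubleArray(n)
        // Forward substitution with the permuted right-hand side: L y = P e_col.
        for (i in 0 until n) {
            x[i] = if (perm[i] == col) 1.0 else 0.0
            for (j in 0 until i) x[i] -= lu[i][j] * x[j]
        }
        // Back substitution: U x = y.
        for (i in n - 1 downTo 0) {
            for (j in i + 1 until n) x[i] -= lu[i][j] * x[j]
            x[i] /= lu[i][i]
        }
        for (i in 0 until n) inv[i][col] = x[i]
    }
    return inv
}
```

The decomposition is O(n³) once, and each of the n columns is then just two triangular solves, which is where the speedup over cofactor expansion comes from.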



Post by Jamis on Feb 28, 2022 23:58:04 GMT
Thanks for pointing that other algorithm out! Ultimately, it came down to which algorithm I felt I could describe and decompose sufficiently in the space I had, such that it could be implemented test-first.



Post by trejkaz on Mar 22, 2022 6:51:23 GMT
I was talking with a friend about this one and they reminded me that at school we were taught a different way to calculate determinants, which might also be fairly performant for 4x4 matrices, but I haven't tried coding it up yet:

- Set resultSign to 1
- If the top-left cell is a zero:
  - If everything below it is also zero, then your determinant is zero, so you can return 0
  - If something below is nonzero, swap that row with the top row and flip the sign of resultSign
- Now that the top-left cell is nonzero, for every row below with a nonzero cell in that column, subtract the appropriate multiple of the top row to make that cell 0
- Repeat for the next diagonal cell. (Could even use recursion still, if you wanted to get the 2x2 case right first and then work up to 3x3 and then 4x4.)
- If you get to the end without returning 0, compute the product of the cells on the diagonal, multiply by resultSign, and bam, there's your determinant.
But actually, the reason why I was looking at faster ways was that I wanted to compute the determinant of a 200x200 matrix and it was taking minutes to run.
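The steps above can be sketched in Kotlin like so (illustrative, not code I've benchmarked). The point for big matrices is that this is O(n³), whereas cofactor expansion is O(n!), which is why the 200x200 case was taking minutes:

```kotlin
// Elimination-based determinant: reduce to upper-triangular form,
// tracking sign flips from row swaps, then multiply the diagonal.

fun determinant(m: Array<DoubleArray>): Double {
    val n = m.size
    val a = Array(n) { m[it].copyOf() }    // don't clobber the caller's matrix
    var sign = 1.0
    for (k in 0 until n) {
        // If the diagonal cell is zero, look below for a nonzero row to swap in.
        if (a[k][k] == 0.0) {
            val p = (k + 1 until n).firstOrNull { a[it][k] != 0.0 } ?: return 0.0
            val tmp = a[k]; a[k] = a[p]; a[p] = tmp
            sign = -sign                    // a row swap flips the sign
        }
        // Subtract multiples of row k to zero out everything below the diagonal.
        for (i in k + 1 until n) {
            val factor = a[i][k] / a[k][k]
            for (j in k until n) a[i][j] -= factor * a[k][j]
        }
    }
    // Product of the diagonal cells, times the accumulated sign.
    var det = sign
    for (k in 0 until n) det *= a[k][k]
    return det
}
```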



Post by bit101 on Oct 23, 2022 15:29:26 GMT
As the inverse and transposed inverse are directly based on the base transform and will not change unless that changes, I just made a setter to set the transform rather than a property that can be set directly. The setter sets the transform, then calculates the inverse and transposed inverse and saves them. I make sure that any code that affects the transform uses the setter. Then I'm only calculating the other properties once, rather than every time they are accessed. It saves a ton of time.
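The pattern looks roughly like this (a Kotlin sketch using a 2x2 matrix so the inverse stays short; the idea is the same for the book's 4x4 transforms):

```kotlin
// Caching via a custom setter: the inverse and transposed inverse are
// recomputed only when the transform itself is assigned, never on read.

typealias Mat2 = Array<DoubleArray>

fun invert2x2(m: Mat2): Mat2 {
    val det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    require(det != 0.0) { "matrix is not invertible" }
    return arrayOf(
        doubleArrayOf( m[1][1] / det, -m[0][1] / det),
        doubleArrayOf(-m[1][0] / det,  m[0][0] / det),
    )
}

fun transpose2x2(m: Mat2): Mat2 =
    arrayOf(doubleArrayOf(m[0][0], m[1][0]), doubleArrayOf(m[0][1], m[1][1]))

class Shape {
    var inverse: Mat2 = arrayOf(doubleArrayOf(1.0, 0.0), doubleArrayOf(0.0, 1.0))
        private set
    var transposedInverse: Mat2 = inverse
        private set

    // All mutation goes through this setter, so the cached values never go stale.
    var transform: Mat2 = arrayOf(doubleArrayOf(1.0, 0.0), doubleArrayOf(0.0, 1.0))
        set(value) {
            field = value
            inverse = invert2x2(value)
            transposedInverse = transpose2x2(inverse)
        }
}
```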



Post by bit101 on Oct 23, 2022 23:36:02 GMT
And now that I've read the forum a bit, I see that I'm about the 10,000th person to make this discovery.



Post by vorpal on Jan 27, 2023 19:02:16 GMT
Calculating the determinants and storing them (I am working in Kotlin and using lazy initialization on them as properties) is a pretty negligible part of the rendering. I think that minimizing the matrix multiplications and using a faster multiplication algorithm will probably speed things up significantly.
A kd tree on triangular meshes sped things up dramatically, but it doesn't work in all cases.
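The lazy-initialization approach looks roughly like this in Kotlin (an illustrative sketch, not my actual Matrix class — the `evaluations` counter is only there to show the caching):

```kotlin
// A `by lazy` delegated property: computed on first access, cached
// thereafter. Safe only because the matrix itself is immutable.

class Matrix2(private val a: Double, private val b: Double,
              private val c: Double, private val d: Double) {
    var evaluations = 0
        private set

    // Evaluated at most once, however many times the renderer reads it.
    val determinant: Double by lazy {
        evaluations++
        a * d - b * c
    }
}
```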



Post by trejkaz on Apr 4, 2023 22:02:51 GMT
The usual tricks to speed up multiplication assume that every multiplication between two values is one operation. They assume you will have n³ multiplication ops. Then they go on to say that a 4×4 matrix multiplication only needs 47 multiplications. Compared to 4³ = 64, that's 73% of the time.

But wait: when I multiply matrices, I use vector operations to do the dot product. So I'm already running 3-4 times faster than the naïve case. So I'm not sure these "smart" algorithms for multiplication will help me. At best, they will cause me to have to do new research to figure out how to apply vector operations to their new, more complex algorithm.

What would help my situation immensely would be matrix operations on the CPU. (That, and a faster way to do cross product, because when I profile my code, that is where all the time seems to be going right now.)

As an additional note on vector ops: I tried to make a library where you just fed it multiplication operations to perform and it packed them into vectors for you. But it turns out, if you try to do this, the extra time copying the values around costs so much that there is no benefit. So the only way to profit from vector ops seems to be to use them to store your raw values. Thus, my "tuple" type is wrapping a DoubleVector.

My matrix, though, is still wrapping an ImmutableDoubleArray, so there might be some slight wins to converting the matrix to an array of DoubleVector. In saying that, to do that, you'd have an interesting time choosing whether to make it row major or column major. Intuition says storing it column major might actually be more efficient, since that's what modern graphics APIs appear to have chosen. But for multiplying a matrix by a vector, row major is obviously going to be faster. Store it in both ways?
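To illustrate the row-major point (illustrative Kotlin over a plain DoubleArray, not my actual matrix class): with row-major storage, each component of M * v is a dot product over one contiguous run of the backing array, which is exactly the shape a vectorized dot product wants.

```kotlin
// Row-major 4x4 matrix times a 4-vector: each output component reads one
// contiguous 4-element slice of the backing array. With column-major
// storage the same loop would stride by 4 between reads.

class Matrix4(private val rowMajor: DoubleArray) {   // 16 doubles, rows first
    init { require(rowMajor.size == 16) }

    fun times(v: DoubleArray): DoubleArray {
        require(v.size == 4)
        val out = DoubleArray(4)
        for (row in 0 until 4) {
            val base = row * 4            // start of this row's contiguous slice
            var sum = 0.0
            for (col in 0 until 4) sum += rowMajor[base + col] * v[col]
            out[row] = sum
        }
        return out
    }
}
```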



Post by trejkaz on Apr 4, 2023 22:13:29 GMT
You're assuming I'm only using matrices for transformations, but I already said I was dealing with a 200×200 matrix, which should be a hint that I'm using them for other things. I do also only need to compute this inverse once. The issue was that the algorithm in the book takes several years to run, and I had a limited amount of time to wait for my renders.

