Refactoring Matrix.cpp/.h

Here's the place for discussion related to coding in FreeCAD, C++ or Python. Design, interfaces and structures.
Forum rules
Be nice to others! Respect the FreeCAD code of conduct!
dpoen
Posts: 49
Joined: Wed Feb 02, 2022 10:26 pm

Refactoring Matrix.cpp/.h

Post by dpoen »

Hello everyone,

I've being reading the code of the Matrix object in Matrix.cpp/.h, and it seems to me that there is room for improvement (original author did great, don't get me wrong :) )
I'm posting my thought before pushing code, once we agree on any changes, I will go ahead and make a PR.
I hope that this post in not too long...
Don't hesitate to pinpoint me any rules that I could have be missing ! (note that the reason "Because we do that everywhere else" is fine for me, I don't want to be disturbing the whole code :) )
These note are only "cleaning", I'm sure the code actually work already !
Notes come in any order of importance.

Here are the thing that could be improved, IMHO:

1) Removing unused code : in PR : https://github.com/FreeCAD/FreeCAD/pull/5494

2) Avoid code duplication in -= , +=, *= operator : we define the = operator, and the + , - , * . So instead of duplicating code from + (for instance) into +=, I think I could write instead in += something like

Code: Select all

    (*this) = (this) + input;
    return (*this);
and so on for - and *

3) There is use of short as for-loop counter. I think we could change them back to int. In fact, using shorter type don't improve speed, if it was the intent. Actually, the computer must operate additional masking to work on shorter values, rather than on datatype with regular size.
I would be happy to be proven wrong, ideally I should be running benchmark to verify my point, but still, my personal rule is that int is fine in those cases.

4) In for-loop, counter could be declared inside of the loop statement, instead of before. That reduce scope (well, that not very useful here anyway), but since C++ allows it, I'm used to doing so... Is there any rules to not doing so ? note that transpose() do it, for instance, so that would make think more homogeneous.

Code: Select all

  for (int iz = 0; iz < 4; iz++)
    for (int is = 0; is < 4; is++)
EDIT: sometimes, we use is,iz, sometimes i,j . I'm use to i,j, so to make it consistent I would use it anywhere.

5) More generally I'm fan of reducing scope of variables, so putting them on the smallest block possible. I do feel more readable as well, and we are ensure that the value is not used outside of the block.

6)
EDIT : This point will not be done, as the "best" way to do is not clear, as rediscussed in https://forum.freecadweb.org/viewtopic. ... 71#p568171
I'm also used to defensive programming pattern, where the main code (nominal case) is the least indented. For instance giving that function :

Code: Select all

PyObject* MatrixPy::row(PyObject * args)
{
    int index;
    if (PyArg_ParseTuple(args, "i", &index)) {
        index = index % 4;
        Matrix4D* mat = getMatrixPtr();
        Base::Vector3d v = mat->getRow(index);
        return Py::new_reference_to(Py::Vector(v));
    }

    PyErr_SetString(Base::BaseExceptionFreeCADError, "int expected");
    return nullptr;
}
I would write it as

Code: Select all

PyObject* MatrixPy::row(PyObject * args)
{
    if (PyArg_ParseTuple(args, "i", &index)) {
        PyErr_SetString(Base::BaseExceptionFreeCADError, "int expected");
        return nullptr;
   }
   int index = index % 4;
   Matrix4D* mat = getMatrixPtr();
   Base::Vector3d v = mat->getRow(index);
   return Py::new_reference_to(Py::Vector(v));
}
Here, we don't have to look all the way (granted this function is short, but it's just to demonstrate the pattern ;) ) to the bottom of the code just to get the error message, it should be clear that we fail "hard and now" when the parameters are wrong.
Note that it's my very personal take on that point :)

7) In transpose(), why not just swap two values in place ? That would additionally avoid read/write the diagonal, and avoid buffer and memcpy. Something like : (I might have mess up the indices, but it's just to get an idea) (exchange() could simply use a tmp variable to work, instead of a whole function call, indeed)

Code: Select all

  for (int i = 0; i < 4; i++)
  {
    for (int j = 0; j < i; j++)
      exchange(dNew[j][i], dMtrx4D[i][j]);
  }
8) in Matrix_gauss : shouldn't we format the code ?

9) Why define typedef double * Matrix; ? Could me not use Matrix4D ?
It seems like we want to perform the invertion on regular array rather than on the "same" array located in the class. Is it a performance concern ?
If we could use Matrix4D, in inverseGauss() we could remove the explicit identity initialisation, as the Matrix4D default to that.

Code: Select all

  double inversematrix [16] = { 1 ,0 ,0 ,0 ,
                                0 ,1 ,0 ,0 ,
                                0 ,0 ,1 ,0 ,
                                0 ,0 ,0 ,1 };
If we want to keep Matrix instead of Matrix, then the Matrix_identity function could be used instead (and not remove it in my first point, with the PR)

10) In inverse () : are they German comment ? anyway, should we translate them in English ?

11) The code of PyObject* MatrixPy::rotateX / Y / Z seems to be factorizable a lot : if we define a function with the whole code in the do[ ... }while(false); We could reduce the quantity of code by a lot, and remove the usage of the do{ ... } while(false) altogether !

12) int hasScale(double tol=0.0) const; define an default value for tol, but in that case, given that we do

Code: Select all

   if (tol == 0.0)
        tol = 1e-9;
in the function beginning, would not be better to simply use 1e-9 as a default value in the prototype ?
I'm aware that change slightly the API of this function. Because user that use 0 in they call have learn that the code got a protection for that.
The C++ code would not be impacted (it always call that with no parameter, so it would not be impacted), but only Python API
Given that :

Code: Select all

PyObject* MatrixPy::hasScale(PyObject * args)
{
    double tol=0;
    if (!PyArg_ParseTuple(args, "|d", &tol))
        return nullptr;
    return Py::new_reference_to(Py::Int(getMatrixPtr()->hasScale(tol)));
}
in case of no parameter, maybe initialise tol with 1e-9 ? Then again, it's kinda a duplication, because we need to keep the two values the same.
But it would allows to clean to whole optional parameter thingy on the C++ side, without breaking anything on the Python side.

13) In analyse(), we do

Code: Select all

                sub[0][0] = dMtrx4D[0][0]; sub[0][1] = dMtrx4D[0][1];
                sub[0][2] = dMtrx4D[0][2]; sub[1][0] = dMtrx4D[1][0];
                sub[1][1] = dMtrx4D[1][1]; sub[1][2] = dMtrx4D[1][2];
                sub[2][0] = dMtrx4D[2][0]; sub[2][1] = dMtrx4D[2][1];
                sub[2][2] = dMtrx4D[2][2];
which is basically equivalent to a call to subMatrix(3). But this function is defined only on the python API. Could it be a good idea to make a "subMatrix" method in Matrix.cpp, call it from the python, and call also from analyse().
That would allows usage of subMatrix elsewhere in the core code of FreeCAD.

in the same function, we do also

Code: Select all

 bool ortho = true;
                for (unsigned short i=0; i<4 && ortho; i++) {
                    for (unsigned short j=0; j<4 && ortho; j++) {
                        if (i != j) {
                            if (fabs(trp[i][j]) > eps) {
                                ortho = false;
                                break;
                            }
                        }
                    }
                }
To check for orthogonality. But this function is also implemented in the Python, would could bring it back into the cpp to avoid also code duplication.
Just a thought, however.

14) analyse() use a

Code: Select all

    const double eps=1.0e-06;
But as in hasScale(), it could be nice to get it as an (optional) parameter, with default value at 1.0e-06

15) analyse() use a weird mix of std::string text; and std::ostringstream stringStream; to format the output message with .str() everywhere to convert.
I think follow the pattern of toString() could be nice :
Use one ostringstream, and then convert back to string with .str() just before return (moreover analyse() as only one return already, should be easy enough to use that pattern)


Thank for the readers that arrived here ! :)
Feel free to give me any feedback, I would definitely not be offended if I have to discard my thought :D , Especially because I could be picky on the code, and some(all?) notes could be considered as no real improvement

Cheers,
Last edited by dpoen on Fri Jun 17, 2022 8:37 pm, edited 3 times in total.
User avatar
doia
Posts: 251
Joined: Sat May 29, 2021 5:47 am
Location: Düsseldorf

Re: Refactoring Matrix.cpp/.h

Post by doia »

Wow, deep analysis of this code. I have to admit I do not understand the code as I'm not familiar with matrix manipulation. Just let me add my 2c.

4) Agreed, keep the scope of the counter as small as possible.

6) Yes, "Fail fast and return early" should be a general rule. A function runs to the end only with the happy path, errors exit earlier.

8) Indeed, that's a for-nesting hell. I counted 5 levels. This should be a) formatted and b) split up and c) block of important steps should be commented.

And I'd like to add: write a doxygen style comment before EVERY function to document the code with a short description of its intended behavior. This would greatly help in understanding. Some might say "the code documents itself", which is never the case.

Thank you for starting this refactor process.
dpoen
Posts: 49
Joined: Wed Feb 02, 2022 10:26 pm

Re: Refactoring Matrix.cpp/.h

Post by dpoen »

Thank for your feedback !

I do agree with you, on all point.

I don't know however how FreeCAD ' doygen is formatted, I need to find example before writing it. Are there any reference ?

And for 8) I might need some advice along the way to ensure that I properly understand this inversion code, I've worked with stuff like that in the past, but we know how it goes when the memory is not refreshed often enough :D

I will wait some time to gather others feedback, and then I will proceed to implement.

BTW : Should I make multiple PR or only one with multiple commit ? (or even one big commit, I don't know the git usage on the FreeCAD repo :) ) I might as well continue with the PR I've already opened.
User avatar
Kunda1
Veteran
Posts: 13434
Joined: Thu Jan 05, 2017 9:03 pm

Re: Refactoring Matrix.cpp/.h

Post by Kunda1 »

Thanks for the focused work, analysis and proposal! I'm a non-coder so it's out of my wheel-house.
As for anything doxygen, check out the wiki doxygen page.
Alone you go faster. Together we go farther
Please mark thread [Solved]
Want to contribute back to FC? Checkout:
'good first issues' | Open TODOs and FIXMEs | How to Help FreeCAD | How to report Bugs
openBrain
Veteran
Posts: 9034
Joined: Fri Nov 09, 2018 5:38 pm
Contact:

Re: Refactoring Matrix.cpp/.h

Post by openBrain »

dpoen wrote: Fri Feb 04, 2022 12:09 pm I've being reading the code of the Matrix object in Matrix.cpp/.h, and it seems to me that there is room for improvement (original author did great, don't get me wrong :) )
I read all remarks and actually for most I don't care. :) But in any case, one should think to readability/maintenance when it has to be weighed against performance.
But as foreword, pay attention to not open too much things. You IMO should limit yourself to only a few things and finish them. ;)
2) Avoid code duplication in -= , +=, *= operator : we define the = operator, and the + , - , * . So instead of duplicating code from + (for instance) into +=, I think I could write instead in += something like

Code: Select all

    (*this) = (this) + input;
    return (*this);
and so on for - and *
Do not mix pointer & values :) (*this) = (*this) + input;
6) I'm also used to defensive programming pattern, where the main code (nominal case) is the least indented. For instance giving that function :

Code: Select all

PyObject* MatrixPy::row(PyObject * args)
{
    int index;
    if (PyArg_ParseTuple(args, "i", &index)) {
        index = index % 4;
        Matrix4D* mat = getMatrixPtr();
        Base::Vector3d v = mat->getRow(index);
        return Py::new_reference_to(Py::Vector(v));
    }

    PyErr_SetString(Base::BaseExceptionFreeCADError, "int expected");
    return nullptr;
}
I would write it as

Code: Select all

PyObject* MatrixPy::row(PyObject * args)
{
    if (PyArg_ParseTuple(args, "i", &index)) {
        PyErr_SetString(Base::BaseExceptionFreeCADError, "int expected");
        return nullptr;
   }
   int index = index % 4;
   Matrix4D* mat = getMatrixPtr();
   Base::Vector3d v = mat->getRow(index);
   return Py::new_reference_to(Py::Vector(v));
}
There are pros & cons. For myself, I like to have the "default/fallback" value at the end, so if someone want to add another "if" statement that would process another case, it's not a mess. Imagine your 'if' statement if the code is then processing several situations...
I'd vote against this, but if you do, please think about negating the 'if' test. Here you just broke the code. ;)
12) int hasScale(double tol=0.0) const; define an default value for tol, but in that case, given that we do

Code: Select all

   if (tol == 0.0)
        tol = 1e-9;
in the function beginning, would not be better to simply use 1e-9 as a default value in the prototype ?
IMO, the best would be to use the standard Precision::Confusion
dpoen
Posts: 49
Joined: Wed Feb 02, 2022 10:26 pm

Re: Refactoring Matrix.cpp/.h

Post by dpoen »

@Kunda1, thanks for the link

@openBrain
But in any case, one should think to readability/maintenance when it has to be weighed against performance.
To be fair, my proposal don't change anything on performance, because it doesn't change the actual code executed :) Those change are definitely to increase readability, and remove duplicate code (less code less bug :) )
You IMO should limit yourself to only a few things and finish them.
You right, I will try to work on little piece or code at once.
Do not mix pointer & values :) (*this) = (*this) + input;
Let's say it's a mistype :D
There are pros & cons. For myself, I like to have the "default/fallback" value at the end, so if someone want to add another "if" statement that would process another case, it's not a mess. Imagine your 'if' statement if the code is then processing several situations...
I'd vote against this, but if you do, please think about negating the 'if' test. Here you just broke the code.
Ok, since it's not clear if their is a "best" to do it, I will leave it like that, at least in the first time.
Edited the original message
IMO, the best would be to use the standard Precision::Confusion
Ok, I'll work with it.
openBrain
Veteran
Posts: 9034
Joined: Fri Nov 09, 2018 5:38 pm
Contact:

Re: Refactoring Matrix.cpp/.h

Post by openBrain »

dpoen wrote: Fri Feb 04, 2022 1:00 pm BTW : Should I make multiple PR or only one with multiple commit ?
One PR should be OK. To me a rule of thumb is to not mix Refactoring with Bugfix with Feature (improvement/creation).
Number of commits is up to you, but each commit should be atomic (i.e. should be "pickable' as a single modification).
dpoen
Posts: 49
Joined: Wed Feb 02, 2022 10:26 pm

Re: Refactoring Matrix.cpp/.h

Post by dpoen »

:thumb_up: @openBrain :)
wmayer
Founder
Posts: 20243
Joined: Thu Feb 19, 2009 10:32 am
Contact:

Re: Refactoring Matrix.cpp/.h

Post by wmayer »

2) Avoid code duplication in -= , +=, *= operator : we define the = operator, and the + , - , * . So instead of duplicating code from + (for instance) into +=, I think I could write instead in += something like
The downside is that this always requires a tmp. matrix object which affects performance.
if (PyArg_ParseTuple(args, "i", &index)) {
PyErr_SetString(Base::BaseExceptionFreeCADError, "int expected");
return nullptr;
}
which is the opposite of the original code. If you use an int it raises an error that an int is expected.
in the function beginning, would not be better to simply use 1e-9 as a default value in the prototype ?
Yes, the default should be 1e-9 (or any other small value) because this allows to use 0.0 if needed.
dpoen
Posts: 49
Joined: Wed Feb 02, 2022 10:26 pm

Re: Refactoring Matrix.cpp/.h

Post by dpoen »

which is the opposite of the original code. If you use an int it raises an error that an int is expected.
Yes excuse me I've just missed that :) I will edit my original message
The downside is that this always requires a tmp. matrix object which affects performance.
Oh I see, didn't expect that !
Yes, the default should be 1e-9 (or any other small value) because this allows to use 0.0 if needed.
Fine, and as proposed by openBrain the standard Precision::Confusion should do it.
Post Reply