The points denote the performance of three systems: AR, FR and BMN. For instance, the FR system has a precision of 65% and a recall of 85%. According to the F1-score:
* even though BMN is very close to AR, it has a better F1.
* even though FR has the same precision as AR, its F1 score (green zone) is far from the magenta zone of AR.
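
To make the comparison concrete, here is a minimal sketch of the F1 computation, assuming the standard definition of F1 as the harmonic mean of precision and recall. The function name `f1_score` is a hypothetical helper, not taken from any library. Only the FR values (65% precision, 85% recall) come from the text; the recall values in the loop are illustrative assumptions, not the actual AR or BMN figures.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both expressed in [0, 1])."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both precision and recall are 0
    return 2 * precision * recall / (precision + recall)


# FR system: precision 65%, recall 85% (values stated in the text).
fr_f1 = f1_score(0.65, 0.85)
print(round(fr_f1, 3))  # 0.737

# Same precision, different recalls. These recalls are made up for
# illustration only; they show why two systems that share a precision
# can still fall into different F1 zones.
for recall in (0.45, 0.65, 0.85):
    print(recall, round(f1_score(0.65, recall), 3))
```

With precision held fixed, F1 increases with recall, which is consistent with two systems of equal precision landing in different F1 zones.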
