How does your accuracy compare with VLMs like ColFlor and ColPali?
We think about accuracy in 2 ways
Firstly as a function of the independent components in our pipeline. For example, we rely on commercial models for document layout and character recognition. We evaluate each of these and select the highest accuracy, then fine-tune where required.
Secondly we evaluate accuracy per customer. This is because however good the individual compenents are, if the model "misinterprets" a single column, every row of data will be wrong in some way. This is more difficult to put a top level number on and something we're still working on scaling on a per-customer basis, but much easier to do when the customer has historic extractions they have done by hand.