You make a very good point. The distance per stroke is not the whole story. The amplitude and effort that go into the stroke are just as important (along with drag details like speedskin vs. wetsuit or bathing suit).
The dilemma is that we cannot directly see the effort (i.e., the energy or work) that goes into a fin stroke. The closest observable is the velocity and acceleration profile of the fin stroke. A smooth sinusoidal velocity profile will have a more efficient distribution of vorticity throughout the stroke cycle, but a profile with sharp impulses and abrupt velocity changes will generate stronger vortices and have more energy and total thrust embedded in its motion. But without knowing the back pressure or resistance of the fin, the work being done by the swimmer is unknown.
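To make that last point concrete, here is a rough sketch of why the velocity profile alone does not pin down the work. Work over a stroke is roughly the sum of force times velocity times the time step, so the same kick profile gives very different numbers depending on the resistance. Everything here is made up for illustration: the 1.5 m/s peak fin speed, the one-second kick cycle, and especially the assumption that fin resistance is simply proportional to fin velocity (real fins will not behave that simply). The names `stroke_work`, `soft_force` and `stiff_force` are hypothetical.

```python
import math


def stroke_work(velocity, force, dt):
    """Approximate work over one stroke as the sum of F * v * dt."""
    return sum(f * v * dt for f, v in zip(force, velocity))


# One kick cycle sampled at 100 points (hypothetical numbers throughout).
n = 100
period_s = 1.0
dt = period_s / n

# Smooth sinusoidal fin velocity, peak 1.5 m/s (assumed).
velocity = [1.5 * math.sin(2 * math.pi * i / n) for i in range(n)]

# Two imaginary fins with the SAME velocity profile but different back pressure.
# Resistance proportional to velocity is an illustrative assumption only.
soft_force = [20.0 * v for v in velocity]   # newtons
stiff_force = [40.0 * v for v in velocity]  # newtons

work_soft = stroke_work(velocity, soft_force, dt)
work_stiff = stroke_work(velocity, stiff_force, dt)

print(f"soft fin:  {work_soft:.1f} J per stroke")
print(f"stiff fin: {work_stiff:.1f} J per stroke")
```

On video the two kicks would look identical, yet in this toy model the stiffer fin costs twice the work per stroke. That is the missing piece when all we have is the velocity profile.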
So, the velocity profile cannot tell us the actual total effort. It can only indicate whether a particular fin is being pushed to its thrust limits or is being used normally, within its smooth and most efficient operating range.
The amplitude of the fin stroke and distance traveled per stroke are related to the fin's thrust capability and the drag of the swimmer. The amplitude and velocity profile of the fin stroke are related to the effort that is being exerted. Though it is not the whole story on its own, that information, combined with a description like "this is an easy and comfortable feeling fin" or "this fin is a heavier load on the feet than expected", can start to paint a meaningful picture of what's going on.
Without the opportunity to actually use two different fins in a careful side-by-side evaluation, looking at these indicators is often about the best we can do, and with enough pieces of information quite a bit about relevant performance can sometimes be deduced. If there are persistent differences in lap times and/or stroke count, differences in the velocity profile of the kick stroke, and descriptions of differences in the back pressure, these indicators can often be used to uncover a performance gap between two monofins, if that gap is large enough. If the two fins are close in performance, the relatively small gap will probably not be resolved by this method (at least not legitimately).
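As a rough illustration of how lap times and stroke counts turn into comparable indicators, here is a small sketch. The function name `lap_indicators`, the 25 m pool, and the lap figures for the two fins are all invented for the example; they are not measurements of any real fin.

```python
def lap_indicators(pool_length_m, lap_time_s, stroke_count):
    """Derive simple per-lap indicators from what can be seen on video."""
    return {
        "speed_m_per_s": pool_length_m / lap_time_s,
        "distance_per_stroke_m": pool_length_m / stroke_count,
        "stroke_rate_hz": stroke_count / lap_time_s,
    }


# Hypothetical laps in a 25 m pool with two different fins.
fin_a = lap_indicators(pool_length_m=25.0, lap_time_s=20.0, stroke_count=10)
fin_b = lap_indicators(pool_length_m=25.0, lap_time_s=18.0, stroke_count=9)

for name, result in (("fin A", fin_a), ("fin B", fin_b)):
    print(
        f"{name}: {result['speed_m_per_s']:.2f} m/s, "
        f"{result['distance_per_stroke_m']:.2f} m/stroke, "
        f"{result['stroke_rate_hz']:.2f} strokes/s"
    )
```

In this made-up case both fins are kicked at the same rate, but fin B covers more distance per stroke and is faster. Even so, that says nothing about effort until it is combined with the velocity profile and how the fin feels on the feet.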
Short of purchasing a Specialfins or Nemo to be able to test side by side, I thought it worth looking at the visual evidence to see what can be learned. So far, my finding has been that there is a big difference, but that's based on the videos I've been able to find, which don't show the Nemo being used with reasonably good technique. Without that, the big difference has little meaning and is probably not worth noting. There is just too much uncertainty. By the time divers get good technique, I think they often move on to swimming with a hyperfin, so there is just not a lot of video evidence on YouTube to draw from that I've been able to find.
At this point I find it difficult to even swim with a Nemo, having grown accustomed to my hyperfin. I will say that, for me, it is more difficult to transition back to the Nemo than to the Orca or the X-22. Having spent a couple of years using a Nemo (which I think is an excellent fin, btw), I have a high degree of confidence that the X-22, and the pilot, will exceed its performance* for a diver with good technique with both fins. *By 'exceed its performance' I mean energy for distance.