Probe Software Users Forum

Hardware => Cameca => Topic started by: Ben Buse on October 08, 2025, 01:09:39 AM

Title: reading CAMECA files without software
Post by: Ben Buse on October 08, 2025, 01:09:39 AM
Hi

Does anyone have a script that will read CAMECA files, both the wavescan files (wdsDat) and the quant data files, and export them to a more useful format?

It seems this will be important going forward as more CAMECA instruments reach their end: we need to be able to access old data without maintaining old computers.

Ben
Title: Re: reading CAMECA files without software
Post by: sem-geologist on October 08, 2025, 03:04:53 AM
TLDR;
Script for export? No.
Library for reading? Yes (with a caveat: you can read things, but you will need to work out how to export them into the form you want).

Their formats are binary, and the structure is dynamic. A script would be far too simple for such an extremely difficult task. I have reverse engineered the formats to some extent and described them using Kaitai Struct; the website of that RE tool is kaitai.io (https://kaitai.io/).
The repository with the parser description in Kaitai Struct code is here:
https://github.com/sem-geologist/peaksight-binary-parser (https://github.com/sem-geologist/peaksight-binary-parser)

Kaitai Struct code can be used to inspect single files in the Kaitai Struct Web IDE (a kind of very advanced hex editor), but it is impractical on its own. Kaitai Struct code can also be translated into many different programming languages, producing a parser library.
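As a rough illustration of that last step: once the Kaitai Struct description has been compiled for Python (for example with `kaitai-struct-compiler -t python` on the `.ksy` file), the generated class is used like any other module. The sketch below assumes the generated module is importable as `cameca` and exposes a class `Cameca`, and that datasets sit under `.content.datasets`, as in the snippets later in this thread. `from_file` is the loader that the Kaitai Struct Python runtime provides on generated classes. The helper names `load_cameca` and `dataset_summaries` are made up for illustration.

```python
def load_cameca(path):
    """Parse one Peaksight binary file (e.g. wdsDat, qtiDat)."""
    # Imported lazily so the rest of this sketch works without the parser.
    # Assumption: module and class names match those used later in the thread.
    from cameca import Cameca
    return Cameca.from_file(path)


def dataset_summaries(parsed):
    """Return (setup file name, comment) for every dataset in a parsed file."""
    return [
        (d.header.setup_file_name.text, d.comment.text)
        for d in parsed.content.datasets
    ]
```

Typical use would be `dataset_summaries(load_cameca("some_file.wdsDat"))`, with the file name being a placeholder.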

To see some practical usage of such a parser built for Python, look here:
https://github.com/sem-geologist/HussariX (https://github.com/sem-geologist/HussariX)

BTW, what do you mean by the "end"? These instruments are among the most repairable and upgradable machines on the market.
Title: Re: reading CAMECA files without software
Post by: Ben Buse on October 08, 2025, 07:46:58 AM
Thanks, I'll take a look at that. I've just been modifying https://stackoverflow.com/questions/31410043/hiding-lines-after-showing-a-pyplot-figure (https://stackoverflow.com/questions/31410043/hiding-lines-after-showing-a-pyplot-figure) to plot a folder of exported wavescans (which I'd exported from the CAMECA software), allowing selection of lines from the legend.
Title: Re: reading CAMECA files without software
Post by: sem-geologist on October 10, 2025, 12:41:25 AM
Quote from: Ben Buse on October 08, 2025, 07:46:58 AM
Thanks I'll take a look at that. I've just been modifying https://stackoverflow.com/questions/31410043/hiding-lines-after-showing-a-pyplot-figure (https://stackoverflow.com/questions/31410043/hiding-lines-after-showing-a-pyplot-figure) to plot a folder of exported wavescans (which I'd exported from cameca software), allowing selection of lines from legend.

I think I get what you are trying to achieve, and I think HussariX would let you do exactly that straight from the binary files, without any export. Let me know, or PM me, if you need more help setting it up. Unfortunately it still has no installer, but as you are already familiar with Python scripts, setting up the libraries should not be very difficult for you. (Every time I get some spare time and can sit down to continue my software work, yet another hardware problem appears on one of our two EPMAs or their peripherals and takes away all my time.)
Title: Re: reading CAMECA files without software
Post by: Ben Buse on October 14, 2025, 03:07:10 AM
OK, I've done the easy part: I followed your instructions, read the CAMECA wavescan file, and printed the comments for each sample, which works great. Now for the hard part: working out how to extract the x and y values. Any ideas?

dts = parsed_data.content.datasets
# if we want to print list of datasets with its setup name and comments:
for i in dts:
  print(i.header.setup_file_name.text, i.comment.text)
wavescan over beryllium4.wdsSet Beryl sk-2
wavescan over beryllium4.wdsSet Beryl sk-1
wavescan over beryllium4.wdsSet Chrysoberyl
wavescan over beryllium4.wdsSet b7 al2o3
wavescan over beryllium4.wdsSet cstd1 Quartz
wavescan over beryllium4.wdsSet Phenakite
wavescan over beryllium4.wdsSet herderite
wavescan over beryllium4.wdsSet b7 durango apatite
>>> i
<cameca.Cameca.Dataset object at 0x000001C8A6C757E0>
>>> i.__dict__

>>> i.__dict__.keys()
dict_keys(['_io', '_parent', '_root', 'header', 'items', 'comment', 'reserved_0', 'n_extra_wds_stuff', 'extra_wds_stuff', 'has_overview_image', 'polygon_selection', 'overview_image_dataset', 'polygon_selection_type', 'is_video_capture_mode', 'reserved_1', 'reserved_v17', 'image_frames', 'reserved_2', 'overscan_x', 'overscan_y', 'dts_extras_type', 'extras'])
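One way to explore such an object without reading long `dict_keys` dumps is a small recursive printer. This is only a sketch with made-up helper names (`public_fields`, `describe`). It relies on two things visible in the output above: parsed fields live in the object's `__dict__`, and parser objects carry an `_io` attribute. Values that the generated parser computes lazily are, as far as I know, only stored once they have been accessed, so they may not show up here.

```python
def public_fields(node):
    """Names of parsed fields on a parser object, skipping internals."""
    # Names starting with "_" (_io, _parent, _root, _raw_...) are bookkeeping.
    return [name for name in vars(node) if not name.startswith("_")]


def describe(node, depth=0, max_depth=2):
    """Print the field tree below `node`, a few levels deep."""
    for name in public_fields(node):
        value = getattr(node, name)
        # Treat anything carrying an _io attribute as a nested parser object.
        if hasattr(value, "_io") and depth < max_depth:
            print("  " * depth + name + ":")
            describe(value, depth + 1, max_depth)
        else:
            print("  " * depth + f"{name} = {type(value).__name__}")
```

Calling `describe(i)` on a dataset would then list `header`, `items`, `comment` and so on, descending into the nested objects. Lists such as `items` are reported by type only, so individual entries need a separate call, e.g. `describe(i.items[0])`.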
Title: Re: reading CAMECA files without software
Post by: Ben Buse on October 14, 2025, 06:20:12 AM
And here is opening up a quant point file.

I've found I can extract the list of elements, but have yet to find the weight percent values.

#prints atomic number of elements
for x in range (0,len(i.items)):
    print(i.items[x].signal_header.element.atomic_number)

This also contains
i.items[1].signal_header.__dict__.keys()
dict_keys(['_io', '_parent', '_root', 'element', 'xray_line', 'order', 'spect_no', '_raw_xtal', 'xtal', 'two_d', 'k', 'reserved_0', 'hv', 'beam_current', 'peak_pos', 'counter_setting'])

So parameters can be extracted as
for x in range (0,len(i.items)):
    print("z:", i.items[x].signal_header.element.atomic_number, "line:", i.items[x].signal_header.xray_line, "order:", i.items[x].signal_header.orde
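A loop of this kind can also collect the parameters into plain dictionaries, which are easier to filter or hand to pandas than printed text. The sketch below uses only field names that appear in the `dict_keys` listing above (`element`, `xray_line`, `order`, `spect_no`, `xtal`). The function name `signal_parameters` and the choice of fields are illustrative assumptions.

```python
def signal_parameters(dataset, fields=("xray_line", "order", "spect_no", "xtal")):
    """Collect acquisition parameters for every item of one dataset."""
    rows = []
    for item in dataset.items:
        header = item.signal_header
        row = {"z": header.element.atomic_number}
        for name in fields:
            # getattr with a default keeps the loop going if a field is absent
            row[name] = getattr(header, name, None)
        rows.append(row)
    return rows
```

With the dataset in `i`, `signal_parameters(i)` gives one dictionary per spectrometer signal.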

Interesting FILL is LLiF and TEPL is LPet
Title: Re: reading CAMECA files without software
Post by: Probeman on October 14, 2025, 08:31:04 AM
Quote from: Ben Buse on October 14, 2025, 06:20:12 AM
Interesting FILL is LLiF and TEPL is LPet

Maybe because the byte order is reversed on Motorola devices compared to Intel?  Big-endian vs. Little-endian:

https://en.wikipedia.org/wiki/Endianness
Title: Re: reading CAMECA files without software
Post by: sem-geologist on October 14, 2025, 09:05:31 AM
Reverse engineering of technology is a bit like archaeology or geology: we see some final result and try to interpret how and why it looks the way it does, rather than the way we would expect.
And indeed I also wondered about this reversed Xtal naming convention inside the binary files. I see a few use cases where it simplifies identification. To add some detail, we need to consider that modern Cameca Peaksight runs on Windows on the x86 architecture (even when installed on x86_64, the 64-bit OS) and uses little-endian for persistent data storage on disk, the same as the OS and architecture. Why does it matter? Well, the Xtal abbreviation is stored a bit differently from other text strings in these files: it is stored with a fixed size of 4 bytes (so 4 letters). The Xtal string could therefore also be cast into a uint32 integer and used as a kind of enum. Moreover, because it is written in reverse, it is easy to check the Xtal family by checking 3 of the 4 bytes, or with a bit mask. Compare `LPET` and `PET\x00`: index-wise they are completely different strings. Now if we reverse them to `TEPL` and `TEP\x00`, ta-da, the first 3 bytes are the same. And if we deal with it as an integer identifier, a simple, ultra-fast bit shift `>> 8` can be used for family-wise filtering. It is more elegant that way. And so it was left like that, as it makes it easier to sort data by Xtal family if needed.
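The byte-level argument can be checked with a few lines of Python. This is only a sketch of the idea, not how Peaksight itself handles the value (which is unknown here), and the function names are made up. One detail worth noting: `>> 8` drops the fourth stored byte only if the four bytes are read most-significant-first; if they are read as a little-endian uint32, the same filtering is done with a mask instead.

```python
def xtal_name(raw):
    """Turn the 4 stored bytes (e.g. b"TEPL") into a readable name ("LPET")."""
    return raw[::-1].decode("ascii").strip("\x00")


def xtal_family_key(raw):
    """Integer shared by all crystals whose first 3 stored bytes match."""
    # Read most-significant-first: the 4th stored byte (the prefix letter
    # or a NUL) ends up in the lowest 8 bits, so ">> 8" removes it.
    return int.from_bytes(raw, "big") >> 8


def xtal_family_key_le(raw):
    """Same idea when the bytes are read as a little-endian uint32."""
    # Here the 4th stored byte lands in the top 8 bits, so a mask removes it.
    return int.from_bytes(raw, "little") & 0x00FFFFFF
```

Both `b"TEPL"` and `b"TEP\x00"` give the same key with either function, while `b"FILL"` gives a different one.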

Now for data access: it's very "easy", it just sits deep down in the data structure. So if your dataset is in the variable `i`, you can access it as:
i.items[index of WDS in dataset].signal
The signal will have further nodes depending on its type. For WDS scan data you can get the raw y values at .signal.data.bytes, which is raw bytes. (Hopefully you have read the introduction in my Kaitai code about the limitations of Kaitai and what is recommended to do in the target language.)
There is an easier and more pleasant way. You could copy the Python wrapper from the HussariX project; it overrides some of the Kaitai Struct types and sanitizes them for easier use with Python.
In particular this file: https://github.com/sem-geologist/HussariX/blob/master/lib/parsers/cameca.py
It won't work unless you either get rid of the `Element` notation, which it uses from another file, or simply copy the whole Element class into that file. (The Element class is a very small class for using element abbreviations and atomic numbers interchangeably, and for checking by one or the other which element it is; I know there are some bloated libs that include this functionality, but in HussariX I used something minimal, with a better layout for performance.) With that wrapper a lot gets easier for WDS: 3 different x and y scales can be produced (x in nm, in keV, or in 100k·sinθ; y in cts, cts/s, or cts/s/nA), and most importantly they come not as bytes but as performant numpy arrays. Also, textual values no longer need .text; they can be seen and used directly. It also reads the images from impDat, and it does so lazily (it won't dump all the huge content into memory, but only gets the data that is asked to be opened).
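The three x scales are related through Bragg's law, n·λ = 2d·sinθ, together with E = hc/λ. The following is a minimal sketch of those conversions and not the HussariX implementation. It assumes the spectrometer position is expressed as 10^5·sinθ (the sinθ-based scale mentioned above), that the caller supplies the crystal 2d spacing in nm (the units of the `two_d` field in the file are not confirmed here), and it ignores any refraction correction. All function names are hypothetical.

```python
HC_KEV_NM = 1.239841984  # h*c in keV*nm


def sin_theta(position):
    """Spectrometer position, assumed to be 1e5 * sin(theta)."""
    return position / 100_000.0


def wavelength_nm(position, two_d_nm, order=1):
    """Bragg's law: n * lambda = 2d * sin(theta)."""
    return two_d_nm * sin_theta(position) / order


def energy_kev(position, two_d_nm, order=1):
    """Photon energy from the wavelength, E = h*c / lambda."""
    return HC_KEV_NM / wavelength_nm(position, two_d_nm, order)
```

For example, with a round-number 2d of 0.4 nm, position 50000 corresponds to sinθ = 0.5, a first-order wavelength of 0.2 nm and an energy of about 6.2 keV.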

Furthermore, I think you insist on using pyplot. I initially tried to do the same, but abandoned matplotlib for this purpose completely, as it performs terribly for interactive work. It is excellent for publication plots, but for interaction (adding and removing lines, rescaling, highlighting, switching from lin to log, etc.) it is really slow and painful. That's why HussariX uses pyqtgraph for very fast plotting. It is also much easier to interface with a GUI, as it is built on Qt.

As for wt%, you will also find it in the .signal branch, which for qtiDat will look quite different. Unless the wt% is in a wdsDat file, in which case bad luck: I never used that Peaksight functionality and did not include it in my RE attempt.
Title: Re: reading CAMECA files without software
Post by: Ben Buse on October 15, 2025, 01:58:57 AM
Thanks for your reply

So I took your linked file, copied the Element class into it, and changed
from .cameca_ks.cameca import Cameca, KaitaiStream
to
from cameca import Cameca, KaitaiStream
Renamed as cameca_read3

Then ran

import cameca_read3
>>> cameca_read3.WdsScanSignal(i.items[0].signal.data.bytes)


with result

Process ended with exit code 3221225725.
The key is how to decode the bytes...
b'\xc3\xc8\x8fB\x02\x88\x89BB\xe9\x92B4\n\x9bBI)\x94B8\x07\x82B}J\x9cB\xde\x8b\xa7BVl\xa9BT\x10\xc3B\x19\xb4\xd7B\xd6\x12\xd2B#\x95\xdeB\x00\xf4\xd8B\xe6\x8e\x04C\xa4\xbb\xffB\x81{\xfeB[\xfb\xfbB\xc1\x10\x0cC\xed\xf9\xf6B&\x9f\x03C\xee\x00\rCv^\x02C\x99p\x0bC \xcf\x05C%\xe0\tC\xa1\xc0\x0bC4\xa2\x12CD\xe1\x0eC~\x92\x13C\xf4P\rC+\xd4\x19C\xd8\xe3\x18C\x8b\x18*CH\xb7%C\xa2\x9a1CI\n0C\x06\xdb2CaO@C&\xff?C\xe6\xd3LC\xff\t^CB\x9cdC\xadu{C\x13\x16\x85CB\xe4\xa1CdA\x9cC\xd8O\xa8C\xe6\x9c\xb0CA\xde\xbeCU&\xd7C\xde\xf1\xf9CQ\xe8\x11D\xd9\xc5"D\n\x0c<D\xe6HZD\x88\\sD\xe5\x9d\x90D\x1e\xb0\xb3D_\x08\xd5D\xb7i\xf1D\x99X\x02E\xb0\xf7\xfdD\xecj\xf1D\xe8O\xf3D\xa2,\xf1Df\xf2\xfeD\n(\x03Ep\xc8\xffD\xdf/\x06E\x18\xf9\x0bE\x12\xa5\nE\x99j\x07E\x93\x02\x00E\xfd%\xffDlH\xefD\x85\xf2\xd8DV\xc5\xbeD\xd5\xc6\x9fD\x848\x84DV\x86lD\x02\x0fOD\x89\xd38DOz\x1fD\xd8a\x0eDH\xb6\xfdC[\x92\xe5Cc:\xb9CiZ\x9dChd\x92C\xe4olC\x98\xe1FC\xa4x)C\xcc>\x04C:\xd8\xeeB\x05\x13\xd2B\rr\xccB\xe8.\xb7B\x1ap\xc2B;\xcc\xa8Bvk\xa4B\x9b-\xb2Bj\xad\xafBvk\xa4BR)\x94B\x16K\xa1B\xae\xa9\x96B\xf6K\xa6B\x17\xca\x99B\xcaI\x97Bi\xcb\xa3B\xb4,\xadB\xdch\x90B\x14*\x99B#\xcc\xa8B\x18\xca\x99B\xf5)\x99B\xf5h\x90B\xef)\x99BvH\x8dBJ)\x94B\xe3\x89\x98B(\x07\x82B\xfeG\x88B\xdeLvBB\xa7\x82B\x1e\xca\x99B4\x89\x93B\xf6\x08\x91B\x87\x0bfB\xb0\xa9\x96Bl\x87\x84B\xb6\x07\x87B\x0b\xc7\x80B\xe0g\x86B\xc8\xccsB\xe4h\x90B\xfc\xe7\x88Bu\x08\x8cB\x98\x87\x84B\x0eLlB4INB"\xe8\x88B\xa9\xc7\x85B2\x07\x82B\xbc\x0cuB\xc9\xa7\x87B\xb7\xccsBC\x07\x82BGh\x8bB\xb1\xa9\x96B)\xcdxB\xfb\x8bmB(\xcaZBX\xa7\x82B\xba\xc9UB\xb0\xcd}Bh\x8dwB\xb6\x8bhB\xf4&\x80B\xbeKgBH\xa7\x82B\x06\rpB\x92\xcbiB\x07\'\x80BY\tRBO\rzBP\xa7\x82BT\rzBn\x8crB^J]B\xf4\naB\xb8\xccsB]\x89OB\x92\xccsB\xce\naB\xde\x89TB\xa3J]B6\xc9PB\x8e\tRB\xb2\xcbiB\x9e\x89TB\xafISB\xe4\x88JBz\xc8FB]\x0bfB\xc3\xccsB\xaf\xca_B+MvB\x08KbB\xa8KgB\xf7\x08MB\x8a\x8a^B\x06KbBK\x88EB\xf6IXB\x0eMvB\xd6\xc9UBuJ]B`\n\\B\x8d\x89TBw\x88EB\xa3\x89TB\xce\x89TBG\xcbdBWKgBXM{B\x19\x8bcBD\x0cpB\xa8HIBkKgB\xcc\x0bkB}\xc
9PB\n\tMB7\x0e\x7fBcLqB\xd7KlBN\x08CB\xaeHIB\x0e\x8cmB \x08CB\x1b\xc7\x80BF\xc9PBcJ]BbJ]B\xb4\x88JB\xa4\xcbiB\xd8JbBB\x07\x82BoM{B\xaf\x0bkB\xb0\x0bkB\xc0\naBuG\x83B\xe7G\x88B7\xe8\x88B\xad\x87\x84B\xec\x8b\xa7B\xd4I\x97B\x8bk\xa4B\xac\xb3\xd7B\x859\xf3BHv\xe5B\xb0V\xe7BL\x92\xcfB\x1c\x99\xf2BLw\xeaB]\xdb\xfdB\xfb\x97\xedB\xc6\xb4\xdcB\x08\x95\xdeB\x00\x95\xdeB!6\xe4BLu\xe0B\x8as\xd6B-\xb1\xc8BVM\xb0B\xae\x8f\xc0B.\xca\x99B\x80\xac\xaaB\xdc\n\xa0B\xc6\xcd}B\xf4\x07\x87B=h\x8bB\xc4\naBZ\xcdxB\xa1\xccsB\xcc\x0bkB\xa4\xccsB\xc2\xca_B\xc9\tWB\xc6\r\x7fB8\xc8AB\x03\ruB\xff\x89YB\xeeIXBf\n\\BD\rzB/\x89OB\xfbIXB<\xcaZB\xc9\x0bkB\xcd\x88JB8HDB\x1d\xcaZB~\x08HB\xd4\x88JB)\x8cmB\xae\x89OB\x9e\x08HB\xa5\x07>B\xa3\xca_B.MvB\x02\'\x80B\xf6\x87@B5\tMB\xb5\x8d|B\x1aINB\xb0\xcd}B\x98\x89TB\xf9\x8bmB\x9e\xccsBC\x88EB+\x89OB[\x88EB\x92\xc9PB\nJXB\xb2\x0bkB\x0e\x8dwB\x80\x8a^B\xee\x8a^B\x81\x87;B\xd2\tWB\xde\x87@B\x0e\xccnB<\xc77B.\xcaZB8\xc8AB\x17\xcb_B\xa1J]BN\x8aTB\x8aISBy\xc8FB\xa6\x8aYB\xff\x08MB\x1c\x0baB\xb4\x08HB}\x8a^B\x1a\x876B\xe6\tWB\x83\tRB\xd3\x89TB\xa9\xca_B6\x08CBv\xcaZBBINB$\x89OB\xe6IXB\x04JXB"\xcbdBq\tRB\xb7\x08HB\xc0\x88JBc\x89OB\x99\xcbiB%\tMBo\tRB\x0b\x8aYB,\x89OB\x9e\x89TBsISB\xc4\x89TB\x07INBh\tRB\x04\xc9KB`\xc8FB*\xccnB\nJXB\xb0KgB\xdaKlB\xac\xca_B\xcc\naB\xc0HIB\xb8\naB\x02\x0baB{\x87\x84BD\n\\B\xa5\x0bfB\xba\x0cuB\xfaG\x88Br\rzB\xa1\x8d|B\xd7\xe9\x97BT\x08\x8cB$\xcc\xa8B\xc2m\xb3B`\x8e\xb6BNo\xbdBn\xb0\xc3B\x90\xf0\xc4BR\xb0\xc3B\x80\x10\xc3B^3\xd5B\x0eq\xc7Bs\xf2\xceB[\xf1\xc9B]\xcb\xa3B\'\xcc\xa8B\x9b\xed\xb0B\xe8j\x9fB\xd5\xca\x9eBz\xcb\xa3BR\xaa\x9bB)\x8bcB\t\x8cmB\x9cJ]B\xd4\xc9UB\x04\x8bcB\xf2\x0bkB\xd7\xc9UB\x8b\x87;B\xa3\x8a^B\xf6\x88JB\xe2\x87@B\xc6G?B2\xcaZB=\xc9PB\x0c\x8aYB&\xc77B\x8bISB\x8c\x08HB2\x0bfB\x8c\x89TBa\xc8FB\x16\x88@B\xda\xc9UB\x9c\x87;B\xa9\tRB\x18G5B\xc1\xc9UB\xc0\xc62B\xdf\x8a^B\x84\n\\B\xb2\x08HB\x8f\xc7<Bv\xc77Bp\xc8FB 
\x08CB\xbd\x07>B|\x08HB\xa1\x08HB\x06\xc8ABh\xc8FBJ\xc77B\x15\x08CB\x8c\xc7<B\r\x88@B\xfb\x08MB\xb0\xc9UB\xcb\xc62B\x89\x87;B\xf3F5B(\xc77B8H?B\xdc\x861B\xe4\tWBl\x079B"\tMB\x81\x079Bn\x87;B\x19\x074BM\x86,B5\x08CB{\xc6-B\x82\xc77Bt\x87;B\n\xc8ABVG5B\xf2\xc7AB\xd5\x05%B\x0b\tMB\x8b\x8a^Bh\tRB\xaa\xc7<Bc\x86,B,\x08>B\xc8\xc62B\x88\xc6-B\xb1\x07>By\xc9PB\xec\xc62B\xd8\x05%B\xc0G?B\x11\xc72B\xc9G?BO\xc77B\xb1\x07>BN\xcaZB<HDB\x83\x06/B\xe1\x04\x16B\x1e\x08CB\xa0F0B\xccHIB\x93\x06/B\xccF0BxHDB\xc5\x861B}G:B8\tMB\xe6E&B;\xc77B\x84\x08HB\xa8\xc7<Bp\x87;B\xdc\xc77B\xb9\x861B\x0b\x876B\x1a\xc6(B"\x05\x1bB\x14IIB\x91\xc8FBp\x079BfG:B\xed\x85"B:\x04\x0cB{\x87;BhF+BjHDB\xd4HDB\xa4\xc7<B\x85\x86,B\xb1\xc7<BP\x876B8\x06*B\xf4\x064B\xbb\x07>B\x94HIB1\x08CB0\xc6(B\xa2F0B}\x06/B \x05\x1bB)E\x1cB\'\x06*B\xc5\x07>BN\x86,B\xe7\x05%B\xc8ISBk\xc6-B\xec\xc8KB\xfa\x85\'B3G5B\xfb\x87@B\x90\x06/B\xd0\xc5#B\xef\x85\'B\xaaD\x12B\xb5\xc5#B\xbb\xc4\x14BU\x85\x1dB\x19\x88@B\xaa\x06/B\xc1\x87;BM\x86,B\xbe\x07>B\x8c\xc5\x1eBV\x079B\xe3E&Bf\x06*Bk\xc77B\x95F+B\x8eD\x12B;\xc77B\xb2\xc7<B\x08\xc5\x19B\xb4\xc5#B\xbe\x85"B\xaf\x88J...
Title: Re: reading CAMECA files without software
Post by: Ben Buse on October 15, 2025, 08:49:06 AM
Ok if I use line

np.frombuffer(i.items[0].signal.data.bytes, dtype=np.float32)
based on your code

I decode the bytes and get the intensity column for the WDS spectra as values rather than bytes, and can plot them! Success!

i.items[0] being the intensity values for the first spectrometer scan e.g. LPET on 1 on that sample,
i.items[1] the next crystal scan e.g. LTAP on 2, etc...

And xvalues determined as
xvalues = list(range(i.items[0].signal.wds_start_pos,i.items[0].signal.wds_start_pos+(1000*int(i.items[0].signal.step_size)),int(i.items[0].signal.step_size)))
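The two steps above can be combined so that the x axis always has the same length as the decoded intensities, instead of a fixed count of 1000 points. This is a sketch under a few assumptions: the points are evenly spaced, `wds_start_pos` and `step_size` are in the same position units, and the data are little-endian 32-bit floats (consistent with the little-endian storage described earlier in the thread). The function name `wavescan_xy` is made up.

```python
import numpy as np


def wavescan_xy(signal):
    """Return (x, y) arrays for one WDS scan signal."""
    # Explicit little-endian float32: same result as np.float32 on x86,
    # and still correct on a big-endian host.
    y = np.frombuffer(signal.data.bytes, dtype="<f4")
    # One x position per decoded point, so both arrays match in length.
    x = signal.wds_start_pos + signal.step_size * np.arange(y.size)
    return x, y
```

Usage would be `x, y = wavescan_xy(i.items[0].signal)`. Unlike the `int(step_size)` version, a fractional step size is not truncated here. As a sanity check, the first four bytes of the dump posted earlier, `b'\xc3\xc8\x8fB'`, decode this way to roughly 71.9.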
Title: Re: reading CAMECA files without software
Post by: Ben Buse on October 20, 2025, 09:00:09 AM
For completeness here's reading a qtiDat file with quant data into a pandas data frame

import numpy as np
import pandas as pd

# dts = parsed_data.content.datasets, read from a qtiDat file as shown earlier
# one column per element; rows are added as needed when a sample has several rows
df = pd.DataFrame(np.nan, index=range(len(dts)), columns=range(len(dts[0].items)))
dfextras = pd.DataFrame(np.nan, index=range(len(dts)), columns=["comment", "sample_no", "row_no"], dtype="string")
# column names: atomic numbers, taken once from the first sample
# (assumes every sample was measured with the same element list)
colatnum = [item.signal_header.element.atomic_number for item in dts[0].items]
rowcounter = 0
# for each sample
for i in range(len(dts)):
    # for each row in sample
    for ii in range(len(dts[i].items[0].signal.data)):
        # for each element in each row
        for iii in range(len(dts[i].items)):
            # store wt% in cell
            df.at[rowcounter, iii] = dts[i].items[iii].signal.data[ii].oxide_fraction
        # write additional columns
        # comment column
        if dts[i].comment.text == '':
            dfextras.at[rowcounter, "comment"] = "None"
        else:
            dfextras.at[rowcounter, "comment"] = dts[i].comment.text
        # sample number column
        dfextras.at[rowcounter, "sample_no"] = str(i)
        # sample row column
        dfextras.at[rowcounter, "row_no"] = str(ii)
        # move on to the next output row
        rowcounter = rowcounter + 1
df.columns = colatnum
resultdf = pd.concat([df, dfextras], axis=1)
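The element columns above are labelled with atomic numbers. For readability they can be mapped to element symbols with a small lookup table. The sketch below is a hypothetical stand-in, not the `Element` class from HussariX, and only covers Z = 1 to 92.

```python
SYMBOLS = (
    "H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca "
    "Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr "
    "Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Ce Pr Nd "
    "Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt Au Hg "
    "Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U"
).split()


def element_symbol(atomic_number):
    """Element symbol for an atomic number between 1 and 92."""
    if not 1 <= atomic_number <= len(SYMBOLS):
        raise ValueError(f"atomic number out of range: {atomic_number}")
    return SYMBOLS[atomic_number - 1]


def column_labels(atomic_numbers):
    """Symbols for a list of atomic numbers, in the same order."""
    return [element_symbol(z) for z in atomic_numbers]
```

In the script above this would be applied as `df.columns = column_labels(colatnum)` before the final `pd.concat`. If the same element was measured on two spectrometers, the labels will contain duplicates, just as the atomic numbers do.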
Title: Re: reading CAMECA files without software
Post by: Ben Buse on October 21, 2025, 08:24:15 AM
@sem-geologist I should add I've just tried your HussariX program (I removed two lines of code from spectrumWidgets - import warnings and np.rankwarnings - which gave error on my machine), it's very nice, and could be compiled and widely used as a CAMECA spectra tool. How did you choose the name Hussari?

Title: Re: reading CAMECA files without software
Post by: sem-geologist on October 22, 2025, 01:54:39 AM
Quote from: Ben Buse on October 21, 2025, 08:24:15 AM
@sem-geologist I should add I've just tried your HussariX program (I removed two lines of code from spectrumWidgets - import warnings and np.rankwarnings - which gave error on my machine), it's very nice, and could be compiled and widely used as a CAMECA spectra tool. How did you choose the name Hussari?


Thanks for trying. It is a bit out of date; numpy introduced some changes, and that probably caused the error.
It is still not feature-complete. You can only preview X-ray lines, but it does not burn them into the plot. That is a bit complicated, as I want to avoid the marker overcrowding that Peaksight produces when marking many elements and many diffraction orders.
Hopefully I will find some time to update it (there is still quite a TODO list of hardware fixes in the lab, where we have two EPMAs: SX100 and SXFiveFE) and to make an installer (I think it would be cool if it were installable with pip or conda).

As for the name, I have some roots and connections in both Lithuania and Poland. The hussars of the historic Polish-Lithuanian Commonwealth had their merit in tactical speed: lightweight cavalry, easily deployable for fixing urgent problems. So to highlight that the priorities of the software are speed and light weight (minimal dependencies), I used the "Hussari" part, and the X is for X-ray analysis.