• 0 Posts
  • 12 Comments
Joined 3 months ago
Cake day: August 2nd, 2024

  • But you know, for the Olympics the government really needed the extra surveillance cameras. After all, wouldn’t you want everyone to be safe? And who hates sports? I mean, we all love them, and it’s good for us and good for the economy as well. It will put our country on the world map, and there will be so many tourists. After all, we love tourists and diversity, wouldn’t that be nice?

    And just imagine, oh imagine, if one of the terrorists got in here pretending to be a tourist and ruined everything. You wouldn’t want them to ruin all the fun, would you?

    So you see, the government really, really needs to tighten security by adding these surveillance cameras, and rest assured it will use all the latest technologies to help YOU be SAFE. /s



  • Not necessarily. CVs have complicated formatting. Nobody writes (or should write) blocks of text, and you don’t know how many columns the candidate is using. Is the candidate using a dedicated section to show skill ratings, and are those star-based or word-based? So you can still search for individual keywords, but if you copy the whole PDF and paste it into a txt file (which is what will be forwarded to the ATS), it does not make much sense. The structure is too complicated to reliably extract where you studied, what you studied and your grade, what other experience you have, how long you worked there, etc.

    Extracting structured data is a field of research in its own right. There is plenty of recent research on extracting structured data from academic PDFs (I was working on this at a research institute in Germany around 2022). Even when LLMs are used it can get really complicated, to the point that there are specialized LLMs for just that.

    But ATS systems are cheap, or parsing is not a high enough priority, so they don’t even use OCR, let alone LLMs. So unfortunately the responsibility of making an easily parsable CV comes down to the candidate.

    Try this next time you look at your CV: copy its text into a txt file, then think about whether you could write a program that reliably extracts your experience, education, interests, etc. It’s going to be super difficult, and even then it won’t generalize to thousands of other CVs.
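
To make the "try writing a program" exercise above concrete, here is a minimal sketch of the naive approach: treat any line that exactly matches a known heading as the start of a section. The heading list and both sample CVs are made up for illustration. The second sample imitates what a two-column layout might look like after copy-pasting, and the heuristic finds nothing in it.

```python
# Hypothetical heading names; real CVs use many more variants.
HEADINGS = ("experience", "education", "skills", "interests")


def split_sections(text: str) -> dict[str, list[str]]:
    """Group lines under the most recent line that looks like a heading."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        key = line.lower().rstrip(":")
        if key in HEADINGS:
            # A line that is exactly a known heading opens a new section.
            current = key
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections


# Made-up single-column CV text: one heading per line.
single_column = """Experience
ML Engineer, 2020-2023
Education
MSc Computer Science
"""

# The same made-up CV in two columns, flattened row by row.
two_column = """Experience            Education
ML Engineer, 2020-2023    MSc Computer Science
"""

print(split_sections(single_column))
print(split_sections(two_column))
```

The single-column text splits cleanly, while the two-column text yields no sections at all because no line is a bare heading anymore. Patching that one case is possible, but every new layout needs another patch, which is the generalization problem described above.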


  • I think OCR is really good nowadays, but I think old ATS systems don’t use it, or at least use old OCR. If you parse a PDF without OCR, a Word-exported PDF preserves the text order much better than a LaTeX one.

    I actually tried some websites and Python libraries to extract the text from my LaTeX PDF, and none of them gave good results: words would come out in a different order than in the PDF.

    If I use OCR then I get good, coherent text, which is really important for ATS. But I doubt people use OCR cuz it’s kinda expensive, or maybe people just use old ATS systems, etc.
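
One rough way to put a number on "words out of order" is to compare a sentence you know is in the CV with what an extractor returned for it. The sketch below uses `difflib.SequenceMatcher` from the Python standard library on word lists. The function name `order_score` and the sample strings are made up for illustration, and the score is only a crude similarity measure, not what any ATS actually computes.

```python
from difflib import SequenceMatcher


def order_score(expected: str, extracted: str) -> float:
    """Return a 0..1 similarity between the two word sequences."""
    a = expected.split()
    b = extracted.split()
    # autojunk=False keeps the matcher from discarding frequent words.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# Made-up sample: a sentence as written, and two possible extractions.
expected = "Built ML pipelines for fraud detection"
good = "Built ML pipelines for fraud detection"
scrambled = "fraud Built detection ML for pipelines"

print(order_score(expected, good))
print(order_score(expected, scrambled))
```

Both extractions contain exactly the same words, so a plain keyword search would not tell them apart. Only the comparison of word order shows that the second one is scrambled.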



  • just_an_average_joe@lemmy.dbzer0.com to LinkedinLunatics@sh.itjust.works · PDFs
    English · 26 upvotes · 10 downvotes · 1 month ago

    Actually this is good advice. Nowadays nobody reads your CV in the first step. Your CV first goes through an automated system (an ATS, I think it’s called). It’s designed to filter out as many applications as possible.

    The problem with PDF is that it’s terrible to parse cuz it’s designed for humans to read, not machines. The only reliable way to parse it is to convert it to images and then run OCR, which is kinda expensive.

    So before you send a PDF, you should first try converting it to txt and see if the content makes enough sense. Or just use Word to make the CV, then export to PDF.

    When I was looking for a job, I remember there was a website that would give you tips on your CV, and they had an ATS report for it. I was shocked to realize that the ATS completely failed to parse the correct info from my LaTeX CV. Like, I have a lot of AI/ML experience, and it completely missed that and thought I had quality assurance experience. And I was applying for AI jobs, no wonder I couldn’t get any interviews. Then I redid it in Word and sent an exported PDF where Word files weren’t accepted. I got many more interviews after that.
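
The "convert it to txt and see if the content makes sense" check above can be partly automated. This is a minimal sketch that assumes you already have the extracted text as a string (for example from a tool such as Poppler's `pdftotext`, which is not part of this code). The helper names and the sample text are made up for illustration.

```python
import re


def normalize(text: str) -> str:
    """Collapse runs of whitespace and lowercase the text."""
    return re.sub(r"\s+", " ", text).strip().lower()


def missing_phrases(extracted: str, phrases: list[str]) -> list[str]:
    """Return the phrases that do not appear intact in the extracted text."""
    haystack = normalize(extracted)
    return [p for p in phrases if normalize(p) not in haystack]


# Made-up extraction result: one phrase is split across lines,
# and one word is broken in the middle.
extracted = "Machine\nLearning Engineer 2020-2023\nPy thon, PyTorch"
phrases = ["Machine Learning Engineer", "Python", "PyTorch"]

print(missing_phrases(extracted, phrases))
```

A line break inside a phrase is tolerated because whitespace is collapsed first, but a word broken in the middle is reported as missing. If the phrases that matter most for the job come back as missing, the ATS probably won't see them either.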