AI in Reference and Background Checks: Verify Facts, Not Character, practitioner guidance from TheAICommand

AI in Reference and Background Checks: Verify Facts, Not Character

AI is arriving in the verification stage of hiring: tools that draft reference questions, summarise calls, scrape digital footprints and score candidates. The admin is worth automating; the judgement, the collection decisions and the fairness are not. Here is the line, a five-step process that holds it, and two ready prompts.

People & Culture. Written for Australian HR and people teams. General information only. Not legal or HR advice. Employment decisions stay with people.

Quick answer

Use AI to draft consistent reference questions, structure call notes and summarise what a referee actually said. Do not let it scrape a candidate's digital footprint or generate a hire risk score. The Privacy Act limits what you collect, the Fair Work Act prohibits adverse action because of a protected attribute, and a person must own every hiring decision.

A model can now write your reference questions, summarise the call, and hand you a candidate risk score in seconds. Only one of those jobs is safe to give it.

Reference and background checks are the verification stage of hiring. By the time a firm gets here, it usually has a preferred candidate and wants to confirm the story: did they do the job they say they did, is there anything that makes them unsuitable, and can we move to an offer. It is exactly the kind of slow, repetitive, judgement-heavy work AI vendors are now targeting. Some of what they offer is genuinely useful. Some of it will walk an HR team straight into a Privacy Act problem or an adverse action claim. The difference is whether the tool is doing the admin or making the call.

What is actually happening

Three kinds of AI are showing up in this stage, and they are not equal.

The first is drafting and summarising. AI can generate a consistent set of role-relevant reference questions, capture structured notes during a reference call, and produce a neutral summary of what a referee actually said. This is admin, and it is where the value is.

The second is candidate screening at scale. These tools scrape a candidate's public digital footprint (social media, forums, old posts) and assemble a profile or a "risk" flag. Some go further and infer things the candidate never disclosed. This is where the risk lives.

The third is automated scoring. Here a tool takes the reference and background inputs and returns a hire or no-hire recommendation, or a numeric risk score. This is the part that must never be the decision.

The trap is that all three arrive in the same product, framed as one seamless "background intelligence" feature. HR has to unbundle them, because Australian law treats them very differently.

Picture the common scenario. A firm has a preferred candidate, [CANDIDATE_NAME], for a [ROLE]. A background-intelligence tool offers to run the references, verify the employment history, and, on the same dashboard, generate a "digital reputation" score built from a sweep of the candidate's public posts. The first two are ordinary verification. The third collects information the firm never decided it needed, about matters that may have nothing to do with the job, and it packages a judgement the firm has not made. Accepting that score as part of the file is where the risk enters, quietly, because it looks like just another number on the same screen.

The guardrails that decide the line

Three bodies of law shape what an HR team can and cannot do here, and they are not vague.

The Privacy Act sets the collection limits. Under Australian Privacy Principle 3, an organisation may only collect personal information that is reasonably necessary for its functions, and may only collect sensitive information, which includes health, criminal record and more, with the individual's consent, unless an exception applies. The Office of the Australian Information Commissioner is explicit that consent is not inferred simply because you gave someone notice of a collection. APP 3 also requires collection by lawful and fair means. An AI tool that scrapes a candidate's entire digital footprint collects far more than is reasonably necessary for assessing suitability for a specific role, often collects sensitive information without valid consent, and frequently does so by means a candidate would not reasonably expect. That is three APP 3 problems in one feature.

Draft the questions. Do not scrape the whole person.

If any of that screening happens offshore, and much of it does, APP 8 adds a cross-border layer. The OAIC's own worked example in its APP 8 guidance is a recruitment drive where an Australian entity sends job applicant information to an overseas provider to run reference checks. That is a cross-border disclosure with its own obligations. A background-check vendor processing your candidates on servers overseas is not a detail to skip in procurement.

The Fair Work Act sets the fairness limit. Its general protections make it unlawful to take adverse action against a prospective employee because of a protected attribute: race, colour, sex, sexual orientation, gender identity, age, physical or mental disability, marital status, family or carer's responsibilities, pregnancy, religion, political opinion and more. Refusing to employ someone is adverse action. So if an AI screen surfaces or infers a protected attribute (a pregnancy announcement, an age, a disability, a religious affiliation) and that information influences the decision not to hire, the employer is exposed. The problem with AI-scraped profiles is that they surface exactly this kind of protected information as a matter of course, and once it is in front of the decision maker it is very hard to prove it did not count.

The Sex Discrimination Act adds the positive duty. Employers have had a positive duty since December 2022 to take reasonable and proportionate measures to eliminate sex discrimination and harassment as far as possible, and since 12 December 2023 the Australian Human Rights Commission has had the powers to investigate and enforce compliance, a sequence Norton Rose Fulbright's summary of the reforms sets out plainly. A hiring process that runs candidates through an opaque AI screen no one has assessed for bias is hard to describe as a reasonable and proportionate measure. Governing your screening tools is now part of the duty, not a nice-to-have.

Verify a fact, do not infer a character

The cleanest way to hold the line is to separate two things the tools deliberately blur. Verifying a fact is confirming something checkable against a source: that [CANDIDATE_NAME] held the title they listed, on the dates they listed, and that a named referee would work with them again. That is legitimate, bounded, and mostly admin, and AI can help capture and summarise it. Inferring a character is a model producing a judgement about who someone is (a risk score, a personality read, a reputation rating) from indirect signals it scraped. That is not verification; it is speculation dressed as data, and it is where over-collection, inaccuracy and discrimination all live at once. A safe process collects verifiable facts and lets a person form the judgement. It never lets a model form the judgement and then treats it as a fact.

Seven questions to ask before you sign

Because so much of this stage now runs through vendors, the procurement decision is a control in its own right. Put these to any screening or background-intelligence vendor, and keep the answers in the file:

  1. What exactly does the tool collect about a candidate, and can each collection be switched off individually?
  2. Can the automated scoring or risk rating be disabled entirely while keeping the verification and drafting features?
  3. Where is candidate data processed, and if any step is offshore, what does your APP 8 position rest on?
  4. Have you tested the tool for bias against protected attributes, and can we see the method and the results?
  5. What sources feed any digital footprint or social media feature, and how do you handle mistaken identity?
  6. How long do you keep candidate data after the check, and can we direct deletion when it is finished?
  7. When the tool gets something wrong about a candidate, what does the contract say about who is responsible?

A vendor who cannot answer these has told you something useful. A tool you cannot bound is a tool that collects and decides on your behalf, and you wear the consequences, not the vendor.

The five-step process that holds the line

Here is a process that keeps AI on the admin and a person on every decision. Use placeholders, never real candidate data, when you build and test it.

  1. Scope before you collect. Decide, for the specific role, what verification is actually reasonably necessary: confirming employment history and dates, confirming qualifications where they are an inherent requirement, and role-relevant referee input. Write it down. Anything outside that list needs a specific justification, not a default scrape.
  2. Get real consent, and be specific. Tell [CANDIDATE_NAME] exactly what you will check, how, and by whom, including any third-party or offshore provider, and get consent for anything sensitive. Notice alone is not consent. If you use a screening vendor, that consent has to cover them.
  3. Let AI draft the questions and structure the notes. Have the model produce a consistent, role-relevant set of reference questions for the [ROLE], the same for every candidate so comparisons are fair, and use it to capture structured notes and a neutral summary of what [REFEREE_NAME] actually said. This is where AI earns its place: consistency and completeness.
  4. Verify against the source, not the summary. Treat the AI summary as a draft. Confirm the load-bearing facts (dates, title, whether the referee would rehire) against what the referee said and the documents, not against the model's paraphrase. If a background report contains adverse information, give the candidate a genuine chance to respond before it counts against them.
  5. A person decides, and records why. The hiring decision, and the weight given to each piece of verification, is made and documented by a person. No automated risk score is the decision. If the answer is no, the recorded reason is a role-relevant, defensible one, never a protected attribute.
Five steps: scope, consent, AI drafts, verify the source, a person decides.
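
For teams that keep verification files in a system rather than on paper, the scope, consent, source and human-decision steps above could be enforced in the record itself. The following is a minimal sketch, not a definitive implementation: `VerificationFile`, `APPROVED_SCOPE` and the check names are hypothetical, invented for illustration, and step three (AI drafting) is deliberately left outside the record.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical written scope for one role: the only checks the firm has justified.
APPROVED_SCOPE = {"employment_history", "qualification", "referee_input"}


@dataclass
class VerificationFile:
    """One candidate's verification file, keyed by placeholder, never a real name."""
    candidate: str                                   # e.g. "[CANDIDATE_NAME]"
    consented_checks: set = field(default_factory=set)
    verified_facts: dict = field(default_factory=dict)
    decision: Optional[tuple] = None                 # (decider, outcome, reason)

    def record_fact(self, check: str, fact: str, source: str) -> None:
        # Step 1: refuse anything outside the written scope.
        if check not in APPROVED_SCOPE:
            raise ValueError(f"'{check}' is outside the written scope")
        # Step 2: refuse anything the candidate has not consented to.
        if check not in self.consented_checks:
            raise PermissionError(f"no recorded consent for '{check}'")
        # Step 4: every fact carries the source it was confirmed against.
        self.verified_facts[check] = {"fact": fact, "source": source}

    def decide(self, decider: str, outcome: str, reason: str) -> None:
        # Step 5: a named person decides and records a reason.
        if not decider.strip() or not reason.strip():
            raise ValueError("a named person and a recorded reason are required")
        self.decision = (decider, outcome, reason)
```

In this sketch, a call such as `record_fact("social_media", ...)` fails before anything is stored, because that check was never written into the scope. There is no field for an automated score, so the only way to close the file is through `decide`, with a named person and a reason.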

A worked example, end to end

Here is how the process runs on one file, with every identifier replaced by a placeholder.

The situation. A firm has a preferred candidate, [CANDIDATE_NAME], for a payroll team leader role. The verification scope, written down before anything is collected, is three items: employment history and dates at the last two employers, the payroll qualification on the CV because it is an inherent requirement, and referee input against the role's genuine requirements. No social media screen; nobody could justify one as reasonably necessary for this role.

The prompt. The HR adviser pastes the first prompt below into the firm's approved AI tool, with the role's inherent requirements: running a fortnightly payroll for around 400 staff, supervising two payroll officers, and meeting award interpretation deadlines.

What came back. Nine open questions and a verification checklist. One draft question asked how the candidate balances family commitments with month-end deadlines. The model flagged its own output as risking a protected attribute, family or carer's responsibilities, and rewrote it to ask about a time the person managed competing deadlines. The adviser kept the rewrite and used the same nine questions for both referee calls.

The check. The adviser ran the second prompt over de-identified notes from the call with [REFEREE_NAME]. The AI summary recorded the referee as hesitant on the rehire question. The adviser's own notes said the opposite: the hesitation was about whether the new role was senior enough, and the referee said they would rehire without reservation. The summary was corrected against the source before it went anywhere near the file.

The decision. The hiring manager, not the tool, weighed the verified facts, recorded a role-relevant reason, and made the offer. Nothing about the judgement was delegated.

That correction at step four, verifying against the source, is the whole argument in one line: the AI summary was useful, and it was wrong, and only a person checking it against the source caught it.

Two prompts to run the safe part

Both prompts are drafting aids for ChatGPT, Claude or equivalent, built to refuse the unsafe part of the job. Do not paste real identifiers into either: de-identify first and re-identify offline.
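
One way to de-identify before pasting and re-identify offline is a simple name-to-placeholder mapping kept on your own machine. The sketch below is illustrative only: `deidentify`, `reidentify` and the example names are hypothetical. It replaces only the names you list, and it assumes one real name per placeholder. Anything else identifying (employers, locations, dates of birth) still has to be removed by hand.

```python
import re
from typing import Dict


def deidentify(notes: str, names: Dict[str, str]) -> str:
    """Replace each listed real name with its placeholder.

    `names` maps real name -> placeholder,
    e.g. {"Alex Example": "[CANDIDATE_NAME]"}.
    """
    # Longest names first, so a full name is replaced before any shorter overlap.
    for real in sorted(names, key=len, reverse=True):
        placeholder = names[real]
        pattern = re.compile(r"\b" + re.escape(real) + r"\b", re.IGNORECASE)
        # A function replacement avoids any escape handling in the placeholder.
        notes = pattern.sub(lambda _m, p=placeholder: p, notes)
    return notes


def reidentify(text: str, names: Dict[str, str]) -> str:
    """Reverse the mapping offline, after the AI tool has returned its draft."""
    for real, placeholder in names.items():
        text = text.replace(placeholder, real)
    return text


# Usage with obviously fictional names; the mapping never leaves your machine.
mapping = {"Alex Example": "[CANDIDATE_NAME]", "Sam Sample": "[REFEREE_NAME]"}
safe_notes = deidentify("Sam Sample confirmed Alex Example's dates.", mapping)
```

Read the de-identified text yourself before pasting it anywhere: a nickname, a misspelling or an initial will slip past a list-based replacement like this one.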

The first drafts a fair, consistent reference question set for a role.

Prompt
You are an HR adviser helping an Australian employer run fair, consistent reference checks. You draft role-relevant questions and a verification checklist. You do NOT assess candidates, score them, or recommend hiring decisions.

CONTEXT TO USE:
- Only collect what is reasonably necessary for THIS role (Privacy Act APP 3). Sensitive information needs consent.
- Ask every candidate the same core questions so comparisons are fair.
- Never ask about, or infer, protected attributes (age, sex, race, disability, family or carer's responsibilities, pregnancy, religion and more).

YOUR TASK:
1. Produce 8 to 10 open reference questions tied to the inherent requirements, phrased neutrally, the same set for every candidate.
2. Produce a verification checklist: the facts to confirm against source documents (employment dates, title, qualifications where they are an inherent requirement) and how to record each one.
3. Flag any question or check that risks collecting more than is reasonably necessary or surfacing a protected attribute, and rewrite it.

HUMAN BOUNDARY: this is a drafting aid, not advice. A person runs the checks, gives the candidate a chance to respond to any adverse information, decides, and records a role-relevant reason. Do not output a score or a hire recommendation.

INPUTS:
1. Role and inherent requirements: [ROLE_AND_INHERENT_REQUIREMENTS]
2. Role-relevant things to confirm: [ITEMS_TO_CONFIRM]

The second structures and summarises a reference call from your de-identified notes, without evaluating anyone.

Prompt
You are an HR adviser's drafting assistant. You structure and summarise reference call notes for an Australian employer. You do NOT evaluate the candidate, score them, or recommend a decision.

RULES:
- Work only from the notes I paste. Do not add, infer or embellish anything.
- Where the notes are ambiguous, mark the item [UNCLEAR: check with referee] rather than guessing.
- Never infer or record protected attributes (age, sex, race, disability, family or carer's responsibilities, pregnancy, religion and more). If the notes contain one, flag it as [PROTECTED_ATTRIBUTE: exclude from file].
- Use the placeholders exactly as pasted: [CANDIDATE_NAME], [REFEREE_NAME], [ROLE]. The notes are de-identified and will be re-identified offline.

YOUR TASK:
1. Produce a structured summary under these headings: relationship and context; facts verified (dates, title, duties); answers to the standard questions; direct quotes worth keeping; items to verify against documents; unclear items to follow up.
2. Beside every load-bearing fact, quote the exact wording from my notes so a person can check the summary against the source.
3. Do not output any score, rating, ranking or hiring recommendation.

NOTES FROM THE CALL:
[PASTE_DEIDENTIFIED_NOTES]
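
Because the second prompt asks the model to quote your notes beside each load-bearing fact, part of the check against the source can be mechanised. The sketch below is a rough aid under stated assumptions, not a substitute for a person reading the summary: the function names and `SCORE_PATTERNS` are hypothetical, quotes are assumed to use straight double quotation marks, and the pattern list will produce false positives (a date written as 1/2, for instance).

```python
import re
from typing import List

# Assumed wording that suggests a score or recommendation slipped into a summary.
SCORE_PATTERNS = [
    r"\b\d+\s*/\s*\d+\b",              # e.g. "7/10"
    r"\b(?:score|rating|ranking)\b",
    r"\brecommend(?:ed|s|ation)?\b",
]


def _normalise(text: str) -> str:
    # Collapse whitespace and case so line breaks do not cause false mismatches.
    return " ".join(text.split()).lower()


def unsupported_quotes(summary: str, notes: str) -> List[str]:
    """Return quoted passages in the summary that are not verbatim in the notes."""
    source = _normalise(notes)
    quotes = re.findall(r'"([^"]+)"', summary)
    return [q for q in quotes if _normalise(q) not in source]


def score_language(summary: str) -> List[str]:
    """Return phrases that look like a score, rating or recommendation."""
    hits: List[str] = []
    for pattern in SCORE_PATTERNS:
        hits += [m.group(0) for m in re.finditer(pattern, summary, re.IGNORECASE)]
    return hits
```

An empty result from both functions does not mean the summary is right; it only means the quotes exist in the notes and no obvious scoring language appeared. A paraphrase that reverses the referee's meaning, as in the worked example above, still needs a person to catch it.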

Do this Monday

  1. Pull the reference questions you used for your last hire and count how many were genuinely tied to the role's inherent requirements. The gap between that count and the total is your baseline.
  2. Write a one-page collection scope for your next vacancy: what you will verify, why each item is reasonably necessary, and who will see it.
  3. Paste the first prompt into ChatGPT, Claude or equivalent with that role's inherent requirements, and save the question set it returns into your recruitment templates.
  4. Update your candidate consent wording so it names every check, every third-party provider, and any offshore processing, then have it reviewed.
  5. Run the second prompt over notes from an old, fully de-identified reference call, and check every line of the summary against the source.
  6. Send the seven vendor questions to whoever owns your screening tool contract, diarise a follow-up, and keep the answers in the file.
  7. Brief your hiring managers in one paragraph: AI drafts and summarises, a person verifies, decides and records why.

What never to automate

Some parts of this stage are bright lines. Do not let a model make the hire or no-hire decision, or reduce a person to a risk score you act on without reading. Do not run indiscriminate social media or web scraping on candidates as a default control; it collects too much, it is often wrong or about the wrong person, and it drags protected attributes into the decision. Do not feed a candidate's sensitive information (health, criminal history, anything you would not want leaked) into a public AI tool with no data agreement. And do not skip the candidate's right to respond to adverse information; automating them out of that is both unfair and legally risky. Finally, do not keep what you did not need. Sensitive candidate information gathered for a check that is now finished should not linger in a model's history or a vendor's store; collect narrowly, then dispose of it under your retention rules. Every one of these bright lines is a place where a shortcut trades a small time saving for a real, and often personal, harm to someone who cannot even see the tool that judged them.

The tool verifies. A person decides, and records a defensible reason.

The verification stage of hiring is high stakes precisely because it is where a candidate is most exposed and least able to push back. AI can make it faster and fairer by standardising the questions and organising the evidence. It cannot decide who is suitable, and it cannot be trusted to collect responsibly on its own. Keep it on the process. Keep the judgement, the collection decisions and the fairness with a person. That is not caution for its own sake. It is what the Privacy Act, the Fair Work Act and the positive duty already require, and it is what a candidate is owed.

TheAICommand. Intelligence, At Your Command.

Frequently asked questions

Can I use AI to run reference and background checks?
Use AI for the admin: drafting a consistent set of reference questions, capturing structured notes, and summarising what a referee said so nothing is missed. Do not use it to auto-generate a candidate risk score, scrape social media by default, or decide whether to hire. The Privacy Act limits what you collect and a person must own the decision.

What does the Privacy Act allow me to collect about a candidate?
Under APP 3 you may only collect personal information that is reasonably necessary for the role, and you may only collect sensitive information (such as health or criminal record) with the individual's consent, unless an exception applies. Consent is not inferred just because you gave notice. An AI tool that hoovers up a candidate's whole digital footprint collects far more than is reasonably necessary.

Is it legal to screen a candidate's social media with AI?
It is high risk. Automated social-media scraping collects personal and often sensitive information that is rarely reasonably necessary for the role, is frequently inaccurate or about the wrong person, and can surface protected attributes that then taint the decision, exposing you to adverse action claims under the Fair Work Act. If you screen at all, scope it tightly, get consent, and keep a person in the loop.

What are the discrimination risks when AI screens candidates?
Refusing to employ someone because of a protected attribute (age, sex, race, disability, family responsibilities and more) is adverse action against a prospective employee under the Fair Work Act. If an AI tool surfaces or infers a protected attribute and it influences the decision, the employer wears the risk. The Sex Discrimination Act positive duty also requires employers to take proactive steps to eliminate discrimination.

Who is accountable if an AI background check is wrong?
The employer. An AI summary can be confidently wrong, mix up two people, or rely on stale data, but the hiring decision and its consequences sit with the organisation, not the vendor. That is why the referee's own words, the candidate's right to respond to adverse information, and the final call all stay with a person.