Ethical Google Hacking - Sensitive Doc Dork (Part 2)
Intro: What Is Google Hacking?
- Leverage the power of Google for recon through advanced search operators
- Pioneered by Johnny Long
- Google Hacking for Penetration Testers, Third Edition
- Google Hacking Database
- Good for viewing categories
- Search functionality needs improvement
- https://github.com/JohnTroony/Google-dorks/blob/master/google-dorks.txt
- Good for overall search
- After we learn a search operator, search here for other applications
Intro: General Info.
- Level: All
- Course Notes and Errata
- Focus on search operators in practical situations
- Not every Google search operator will be covered
- Examples
- Phishing campaign simulation
- Leveraging financial documents from board of directors’ meetings
- Searching nginx logs for error responses while adjusting for output variability
- How to find admin area source code (or other functionality)
- Search timestamp ranges within PHP error logs
Intro: But Why?
- From a recon perspective, why is Google hacking advantageous?
- Passive
- Hard to trace
- Google makes the connection to the site, not you
Intro: Additional Help
- Search for Google Dorking, which is synonymous with Google Hacking
- Helpful websites
- http://www.googleguide.com/contents/
- https://www.google.com/advanced_search
- Not as powerful
- Can leverage if you need to get going ASAP
Sensitive Doc Dork: Background
- Scenario
- Phishing campaign will be aimed at
- State employees who recently got a raise through the budgeting process
- “confidential employees”
- State employees who have access to privileged information
- Phishing email from “HR”
- “Due to your recent salary adjustment, we need to confirm your banking information. Click here to confirm your bank account on file”
- Link will redirect to a fake employee portal that steals login credentials
- Social Engineering
- Trust is implicitly built through disclosure of sensitive information
- This information is commonly found via Google Dorks
Sensitive Doc Dork: Logical Operators
filetype:(doc | pdf | xls | txt | rtf | odt | ppt) intext:(confidential salary | "salary schedule")
- (): logical grouping
- OR: note the uppercase
- AKA |
- If there isn’t an explicit |, an AND is implied
- Within text
- Ex: Googling WPA2 KRACK Vulnerability
- Between adjacent search operators
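The grouping and implied-AND rules above can be sketched as a small query builder. This is only an illustration: the helper names group and build_dork are hypothetical, and the function does nothing more than assemble the string you would paste into Google.

```python
def group(alternatives):
    """Wrap alternatives in () and join them with |, the short form of OR."""
    return "(" + " | ".join(alternatives) + ")"


def build_dork(filetypes, intext_alternatives):
    # The two operators are separated by a single space.
    # With no explicit | between them, an AND is implied.
    return f"filetype:{group(filetypes)} intext:{group(intext_alternatives)}"


query = build_dork(
    ["doc", "pdf", "xls", "txt", "rtf", "odt", "ppt"],
    # Unquoted words inside one alternative are ANDed; quotes force an exact phrase.
    ["confidential salary", '"salary schedule"'],
)
print(query)
```

Keeping each alternative as one list item makes it easy to swap terms in and out while experimenting with a dork.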
Sensitive Doc Dork: filetype Operator (1/2)
filetype:(doc | pdf | xls | txt | rtf | odt | ppt) intext:(confidential salary | "salary schedule")
- NOT filetype: (doc | pdf | xls | txt | rtf | odt | ppt), with a space after the colon
- True for all operators
- Common file formats indexed by Google
- Can search for file extensions not on this list
- Ex: filetype:md
- Q: Why would this be useful?
Sensitive Doc Dork: filetype Operator (2/2)
filetype:(doc | pdf | xls | txt | rtf | odt | ppt) intext:(confidential salary | "salary schedule")
- filetype:pdf will return:
- this-isn’t-a-pdf.pdf
- this-is-a-pdf.pdf
- this-is-actually-a-pdf.txt
- Common file formats indexed by Google
- Extensions on this list are guaranteed to have this special property
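One way to picture why all three files come back is to treat a match as "extension says PDF, or content says PDF". The sketch below is a local illustration only. How Google actually decides is not stated in these notes, and the %PDF- magic-byte check and the function name looks_like_pdf_result are assumptions made for the example.

```python
from pathlib import PurePosixPath

PDF_MAGIC = b"%PDF-"  # leading bytes of a PDF file


def looks_like_pdf_result(name: str, head: bytes) -> bool:
    """Assumed model: match on the file extension OR on the file's leading bytes."""
    by_extension = PurePosixPath(name).suffix.lower() == ".pdf"
    by_content = head.startswith(PDF_MAGIC)
    return by_extension or by_content


# Hypothetical file contents for the three names on the slide.
candidates = {
    "this-isn’t-a-pdf.pdf": b"just plain text",
    "this-is-a-pdf.pdf": b"%PDF-1.4\n",
    "this-is-actually-a-pdf.txt": b"%PDF-1.7\n",
}
for name, head in candidates.items():
    print(name, looks_like_pdf_result(name, head))
```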
Sensitive Doc Dork: intext Operator
filetype:(doc | pdf | xls | txt | rtf | odt | ppt) intext:(confidential salary | "salary schedule")
- Helpful for constraining a search to a document’s body
- Regular Google search can match page titles, items in the URL path, etc.
- intext:(confidential salary | "salary schedule")
- Has confidential AND salary somewhere in the text body
- We leave this search relatively vague to capture other results of interest
- We don’t do intext:("confidential employee" | "salary schedule")
- "salary schedule": exact match
- Problems
- Look at query
- Too many false positives
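When triaging false positives, it can help to re-apply the intext logic locally to text you have already retrieved. The sketch below assumes a simple lowercase word tokenizer, which is certainly cruder than whatever Google does. The function name matches_intext is hypothetical.

```python
import re


def matches_intext(body: str) -> bool:
    """Local approximation of intext:(confidential salary | "salary schedule")."""
    text = body.lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    # Left alternative: both words appear somewhere in the body (implied AND).
    has_both_words = {"confidential", "salary"} <= words
    # Right alternative: the quoted phrase must appear as an exact sequence.
    has_exact_phrase = re.search(r"\bsalary\s+schedule\b", text) is not None
    return has_both_words or has_exact_phrase


print(matches_intext("Confidential: board approved salary adjustments"))
print(matches_intext("Cafeteria schedule and salary tips"))
```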
Sensitive Doc Dork: inurl Operator
filetype:(doc | pdf | xls | txt | rtf | odt | ppt) intext:(confidential salary | "salary schedule") inurl:(confidential "board approved */*/17")
- confidential: must be in the URL
- "board approved */*/17": can be in the URL or anywhere within the document
- * is expanded to one or more words
- Cross-check via bold in the results output
- Search Ex.
- Doesn’t work well for non-words
- Ex: inurl:"sid=*"
- sid is for a PHP session
- Through a proxy log dork, we can find sensitive URLs/URL parameters
Proxy Log Dork: Why Search Through Proxy Logs?
- URL Data leakage
- Common for all GET requests/responses to be logged at the proxy layer
- Best practice is to place sensitive information within the POST body
- Ex: session id, sensitive tokens, SSNs, etc.
- In practice, these are often placed within a GET request
- Abnormal response codes
- https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
- Example in next slide
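To make the URL data leakage point concrete, here is a minimal sketch that pulls sensitive-looking GET parameters out of a URL as it might appear in a proxy log. The SENSITIVE_KEYS set, the function name leaked_params, and the example URL are all assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical parameter names worth flagging; adjust for the target application.
SENSITIVE_KEYS = {"sid", "session", "token", "ssn"}


def leaked_params(url: str) -> dict:
    """Return the query-string parameters whose names look sensitive."""
    query = urlsplit(url).query
    return {
        key: value
        for key, value in parse_qsl(query, keep_blank_values=True)
        if key.lower() in SENSITIVE_KEYS
    }


print(leaked_params("http://example.com/app.php?sid=abc123&page=2"))
```

Anything sent in a POST body would not show up here, which is why the best practice above keeps it out of logged URLs.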
Proxy Log Dork: AROUND(X) Operator
TERM 1 AROUND(2) TERM 2
- TERM 1 is within 2 words of TERM 2
- Capitalize AROUND for more consistent results
- Useful for searching documents where the ordering of terms can be customized
- Ex: Logs
- Nginx Log Ex: - - - - "GET / HTTP/1.1" STATUS_CODE - - -
- Other fields are written as - to simplify
filetype:log inurl:(access.log | error.log) intext:("HTTP/" AROUND(5) 500) -site:github.com
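The proximity idea behind AROUND(X) can be approximated locally on a log line. This sketch assumes whitespace tokenization and measures distance as the difference in token positions. Google's exact counting rule is not given in these notes, so treat the function around and its distance rule as assumptions.

```python
def around(text: str, term1: str, term2: str, max_distance: int) -> bool:
    """True if tokens containing term1 and term2 are at most max_distance apart."""
    tokens = text.split()
    first = [i for i, tok in enumerate(tokens) if term1 in tok]
    second = [i for i, tok in enumerate(tokens) if term2 in tok]
    # abs() makes the check order-independent, which is the point of AROUND
    # when a log format lets fields be arranged differently.
    return any(i != j and abs(i - j) <= max_distance for i in first for j in second)


# Simplified nginx line from the slide, with STATUS_CODE filled in as 500.
sample = '- - - - "GET / HTTP/1.1" 500 - - -'
print(around(sample, "HTTP/", "500", 5))
```

Substring matching on tokens is deliberately loose here. A stricter version could require the status token to equal 500 exactly.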
Proxy Log Dork: site Operator
- Scopes a search to a particular domain
- Can even be a TLD
site:.net
site:github.com
- Will match github.com and *.github.com
- Leave out www to ensure search of all subdomains
filetype:log inurl:(access.log | error.log) intext:("HTTP/" AROUND(5) 500) -site:github.com
- -site:github.com: helps us remove example logs that are false positives
- Or are they?
- For a targeted search, don’t discard them
- Stack Overflow for recon
- Review Ex.
- The - operator
-site:github.com -next -last -reply -"I want to leave this out"
- Make sure search results don’t contain a given…
- operator
- word
- Ex: -next helps exclude help forum results
- phrase
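The matching rule for site: (the domain itself plus any subdomain) can be sketched as a hostname check, which is handy for filtering result URLs after the fact. The function name matches_site is hypothetical, and the rule is a local approximation of the behavior described above.

```python
def matches_site(hostname: str, site: str) -> bool:
    """True if hostname equals site or is a subdomain of it."""
    host = hostname.lower().rstrip(".")
    suffix = site.lower().lstrip(".")
    return host == suffix or host.endswith("." + suffix)


print(matches_site("gist.github.com", "github.com"))      # subdomain of the site
print(matches_site("gist.github.com", "www.github.com"))  # why www is left out
print(matches_site("example.net", ".net"))                # TLD-only scope
```

Requiring the leading dot before the suffix keeps look-alike hosts such as notgithub.com from matching.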
Error Log Dork: X..Y (Range) Operator
- Finds a given number range
warning error on line php filetype:log 2015..2017
- Search PHP error logs for a given timestamp
- Ex.
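A number range matches any number in the range, not only years in a timestamp, and a local filter makes that easy to see. The log lines below are made up for illustration, and has_number_in_range is a hypothetical helper.

```python
import re


def has_number_in_range(line: str, low: int, high: int) -> bool:
    """True if any standalone integer in the line falls within [low, high]."""
    return any(low <= int(n) <= high for n in re.findall(r"\b\d+\b", line))


# Hypothetical PHP error log lines.
in_range = "[12-Mar-2016 10:15:32 UTC] PHP Warning: Undefined variable in /var/www/index.php on line 42"
out_of_range = "[03-Jan-2014 08:00:00 UTC] PHP Warning: error on line 7"
coincidence = "[03-Jan-2014 08:00:00 UTC] PHP Warning: error on line 2016"

for line in (in_range, out_of_range, coincidence):
    print(has_number_in_range(line, 2015, 2017))
```

The third line is from 2014 but still matches because of its line number, so range hits are worth a second look.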
Admin Functionality Dork: inanchor Operator
- Finds text within a link/anchor
inanchor:admin site:hackthissite.org
- Great way to find admin portals
- How can we remove some of the clutter from the results?
- Ex.
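What inanchor looks at is the visible link text, not the link target. The sketch below collects anchor text from a snippet of HTML with the standard library parser. The class and function names and the sample markup are assumptions for the example.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._in_anchor = False
        self._href = None
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self._href = dict(attrs).get("href")
            self._parts = []

    def handle_data(self, data):
        if self._in_anchor:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            self.anchors.append((self._href, "".join(self._parts).strip()))
            self._in_anchor = False


def anchors_containing(html: str, word: str) -> list:
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return [href for href, text in parser.anchors if word.lower() in text.lower()]


# Hypothetical page fragment: the URL never mentions admin, the link text does.
page = '<p><a href="/cp/login.php">Admin Panel</a> <a href="/forum">Forum</a></p>'
print(anchors_containing(page, "admin"))
```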
Admin Functionality Dork: intitle Operator
- Searches through page titles
inanchor:admin site:hackthissite.org -intitle:"view topic"
- Why did the source code come up in the results?
- Ex.
Further Learning
- Overall Google functionality
- Operators
- Tools
- Dork lists