The problem is usually the original encoding of the source file.
But sometimes it is an end-of-line problem caused by different operating systems. In that case, just use dos2unix and try not to open the files on Windows.
brew install dos2unix
dos2unix filename.csv
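If dos2unix is not installed, a rough equivalent can be sketched with standard tools. The helper names strip_cr and has_cr below are my own, not part of any package, and note that tr removes every carriage return in the file, not only those that are part of a CRLF pair.

```shell
# Hypothetical helper: succeed if the file contains at least one carriage return.
has_cr() {
    grep -q "$(printf '\r')" "$1"
}

# Hypothetical helper: write a copy of the file with all carriage returns removed.
# Usage: strip_cr input.csv output.csv
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}
```

For example: has_cr filename.csv && strip_cr filename.csv filename.unix.csv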
Check the file type from the command line (the capital -I flag is for macOS; on Linux use file -i):
file -I filename.csv
For a file that shows up as gibberish, the result will probably look like the line below. An encoding such as iso-8859-8 or UTF-8 should display Hebrew correctly, so a gibberish file is probably in a different encoding.
text/plain; charset=iso-8859-1
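The file command guesses the charset from byte patterns, so its answer is a hint and not a guarantee. A stricter check for UTF-8 specifically can be sketched by asking iconv to read the file as UTF-8. The helper name is_utf8 is my own.

```shell
# Hypothetical helper: succeed only if the file is valid UTF-8.
# iconv exits with a non-zero status when it hits an invalid byte sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 < "$1" > /dev/null 2>&1
}
```

For example: is_utf8 filename.csv && echo "already UTF-8" || echo "needs conversion"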
Then the challenge is the conversion…
3 ways to convert your gibberish text file into Hebrew:
- Microsoft Excel: rename the file to filename.txt and open it; Excel will open a wizard that lets you choose the encoding.
- Linux command line:
iconv -f iso-8859-1 -t utf-8 < file > file.new
- An online converter to UTF-8. I used: https://subtitletools.com/convert-text-files-to-utf8-online
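The iconv option above can be wrapped so the source encoding is a parameter, which helps when the first guess still gives gibberish. This is a minimal sketch: the helper name to_utf8 is my own, and trying ISO-8859-8 or WINDOWS-1255 for Hebrew files is an assumption about what your source system produced, not something the file command will confirm for you.

```shell
# Hypothetical helper: convert a file to UTF-8 from a given source encoding.
# Usage: to_utf8 SOURCE_ENCODING input output
to_utf8() {
    iconv -f "$1" -t UTF-8 < "$2" > "$3"
}
```

For example: to_utf8 ISO-8859-8 filename.csv filename.utf8.csv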
Try testing the file locally. If you see Hebrew on your desktop, you should be fine on Athena.
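If you prefer a scripted check over eyeballing the file, here is a rough sketch. It assumes the file is already UTF-8 and only looks for the basic Hebrew letters (alef to tav), which UTF-8 stores as byte 0xD7 followed by a byte in the range 0x90 to 0xAA. The helper name has_hebrew is my own.

```shell
# Hypothetical helper: succeed if a UTF-8 file contains at least one Hebrew letter
# (U+05D0..U+05EA, encoded as byte 0xD7 followed by 0x90..0xAA).
has_hebrew() {
    LC_ALL=C grep -q "$(printf '\327[\220-\252]')" "$1"
}
```

For example: has_hebrew filename.utf8.csv && echo "Hebrew found"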
Have fun!
—————————————————————————————————–
- Contact me via LinkedIn: Omid Vahdaty
- website: https://amazon-aws-big-data-demystified.ninja/
- Join our meetup, Facebook group, and YouTube channel:
- Join our meetup: https://www.meetup.com/AWS-Big-Data-Demystified/
- Join our Facebook group: https://www.facebook.com/groups/amazon.aws.big.data.demystified/
- Subscribe to our YouTube channel: https://www.youtube.com/channel/UCzeGqhZIWU-hIDczWa8GtgQ?view_as=subscriber
——————————————————————————————————————————
I put a lot of thought into these blogs so that I can share the information in a clear and useful way. If you have any comments, thoughts, or questions, or you need someone to consult with, feel free to contact me:
https://www.linkedin.com/in/omid-vahdaty/