Unicode Converter - Decimal, text, URL, and unicode converter. As you type in one of the text boxes above, the other boxes are converted on the fly. The Unicode converter doesn't automatically add spaces between the converted values. You can use the add spaces button to separate the Unicode characters so that the converted values will also be separated from one another.
I am working on a project that uses a CSV file supplied by the client to populate a database. The client is to create the spreadsheet, then save it as a CSV to be uploaded, which is then parsed into a database.
The problem is, whenever the client saves the excel spreadsheet as an MS-DOS .csv file, many of the special characters get converted to question marks '?' (symbols such as ' ' / ). However, if we then open the CSV file and manually replace each ? with the right character it works fine.
The problem is that the data file is HUGE, and we can't reasonably do this, so I was hoping there was a way to save it. We have tried exporting as Unicode and ASCII to no avail. We have also tried uploading to google docs, and re-saving, however, it also breaks those characters.
migrated from stackoverflow.comAug 25 '12 at 16:15
This question came from our site for professional and enthusiast programmers.
12 Answers
Ensure you are choosing to save as a CSV (Comma Delimited)
and not a CSV (MS-DOS)
, as DOS doesn't support UTF-8 characters.
I have found that the lost character issue only happens (in my case) when saving from xlsx format to csv format. I tried saving the xlsx file to xls first, then to csv. It actually worked.
A possible workaround is to save it as Unicode Text
(2007 has it, not sure about previous editions), which saves it as a tab-separated text file.
This file preserved my unicode characters (in my case I was working with asian characters) while producing some sort of delimited text file which you can then run through external tools to convert to a csv if necessary.
My input did not have tabs embedded within each cell, however, and I am not sure how that would be handled.
Here's what works for me:
- Make data corrections in Excel or CSV
- Save file As Unicode Text
- Open NOTEPAD
- Open the Unicode file you just saved using NOTEPAD
- Use your cursor to highlight a blank area that holds a single tab 5a. Use the space between the Acquire Id and the Request Type because this holds ONE TAB!
- Hit Cnrl-C to copy the tab character
- Type Cnlr-H to open the Replace function box
- Click in the Find What text box and type Cnlr-V to paste the Tab
- Click in the Replace With Text box and type a comma
- Click Replace to test it one time. Confirm the tab in the file is replaced with a comma
- Click Replace All
- Click Cancel
- Save the file and Exit
- In Windows Explorer change the file extension to .csv
I've been having this issue for a while now, and finally dedicated some time to figure it out! I was able to (seemingly) fix the issue by saving as 'Windows Comma Separated (.csv)'. I tried it from a .xlsx and a .xls, both converted over to a .csv just fine. Hope this helps - let me know if any issues pop up with this method. I'll report back if I see anything over the next few weeks.
Solution:
Open your CSV file in Notepad (you will notice it says ANSI), then re-save it as UTF-8 in Notepad.
Then the import should work. If however, you open it again in Excel and just Save it won't work because Excel can't natively encode in UTF-8 apparently.
The other option is to work with your CSV file in www.LibreOffice.org (free), which can encode correctly in UTF-8 (I haven't tried this myself though).
I ran into a similar issue with importing a csv of products with fractions into WooCommerce, which was getting rejected because of '?' error characters. This is because the CSV file was not encoded in UTF-8. HOWEVER, even after saving the CSV file in Excel with UTF-8 encoding it still didn't work.
After scratching through multiple forums, it appears that although Excel gives the option to encode as UTF-8 - when SAVING AS A CSV, Excel automatically converts it back to ANSI, even though you select UTF-8.
Twisty ImpersonatorEven i faced the issue with special characters while downloading the japanese terms in .csv format.However when i saved the .csv file in Text format(Tab delimited) the japanese characters where populating perfectly. Then i just copied data from text file and pasted in a spreadsheet. Worked well!!!
Thanks,Vaishakh
Found the best solution ever: http://woshka.com/blog/microsoft/microsoft-excel/solve-the-problem-saving-excel-csv-format-with-utf-8-unicode-encoding.html
From link:
1-Click on the start menu
2-Select control panel
3-Find Regional and Language options on classic mode or type its name on the search bar on just top right of the control panel window
4-Click on advanced tab and click locales
5-Click on Persian or Arabic or your desired UTF-08 encoding program that you’d like to save with Excel into CSV
Excellll- Download and install Unicode CSV Addin for excel.
- Save the csv from the new 'Unicode CSV' menu as shown in picture below.
for the ™ I found a solution. In the .xlsx file, replace all '™' with '&tr-ade;'.remove the - from the replace with.Save the file as .csvand all is done.wish it works for you.
On a Mac computer, this is what worked for me.
On excel choose save as and then from the drop-down choose windows comma separated (CSV).
It just works!!!
Save as unicode Rename unicode file as .csv (https://www.youtube.com/watch?v=1VP8__shxTg)