See Comma Separated Values on Wikipedia:
A comma-separated values (CSV) file stores tabular data (numbers and text) in plain-text form. A CSV file consists of any number of records, separated by line breaks of some kind; each record consists of fields, separated by a comma. All records have an identical sequence of fields.
A first approach is to use split on the commas:
> x = '"earth",1,"moon",9.374'
'"earth",1,"moon",9.374'
> y = x.split(/,/)
[ '"earth"', '1', '"moon"', '9.374' ]
This solution leaves the double quotes in the quoted fields.
Worse still, quoted fields may themselves contain commas, in which
case the division produced by split is wrong:
> x = '"earth, mars",1,"moon, fobos",9.374'
'"earth, mars",1,"moon, fobos",9.374'
> y = x.split(/,/)
[ '"earth', ' mars"', '1', '"moon', ' fobos"', '9.374' ]
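The following regular expression matches double-quoted strings, possibly containing escape sequences, optionally followed by whitespace and a comma. Note that it only matches the quoted fields; the unquoted fields 1 and 9.374 are skipped: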
> x = '"earth, mars",1,"moon, fobos",9.374'
'"earth, mars",1,"moon, fobos",9.374'
> r = /"((?:[^"\\]|\\.)*)"\s*,?/g
/"((?:[^"\\]|\\.)*)"\s*,?/g
> w = x.match(r)
[ '"earth, mars",', '"moon, fobos",' ]
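With the g flag, match returns only the complete matches, so the text captured by the parentheses (the field without its quotes) is lost. A minimal sketch of how the captures could be collected with exec instead; the function name quotedFields is hypothetical and not part of the original program:

```javascript
// Collect the contents of each double-quoted field via the capture group.
function quotedFields(record) {
  // Same pattern as above; a fresh literal per call starts with lastIndex 0.
  var r = /"((?:[^"\\]|\\.)*)"\s*,?/g;
  var fields = [];
  var m;
  // Each exec call resumes where the previous match ended.
  while ((m = r.exec(record)) !== null) {
    fields.push(m[1]); // m[1] is the text between the quotes
  }
  return fields;
}

var quoted = quotedFields('"earth, mars",1,"moon, fobos",9.374');
console.log(quoted); // [ 'earth, mars', 'moon, fobos' ]
```

The unquoted fields are still skipped, because the pattern only describes quoted strings.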
This other regular expression, /([^,]+),?|\s*,/, behaves much like split.
It matches either a non-empty sequence of characters containing no commas, optionally followed
by a comma, or a single comma (optionally preceded by whitespace):
> x = '"earth, mars",1,"moon, fobos",9.374'
'"earth, mars",1,"moon, fobos",9.374'
> r = /([^,]+),?|\s*,/g
/([^,]+),?|\s*,/g
> w = x.match(r)
[ '"earth,', ' mars",', '1,', '"moon,', ' fobos",', '9.374' ]
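The two patterns can be combined so that quoted fields are tried first and unquoted fields second. The following is only a sketch of one possible combination, under the assumption that trying the quoted alternative first is enough for simple records; it is not necessarily the regexp the original program intends, and the names field, record and pieces are hypothetical:

```javascript
// Hypothetical combination of the two patterns shown above:
//   1st alternative: a double-quoted field (may contain commas and escapes)
//   2nd alternative: an unquoted, non-empty field
//   3rd alternative: a lone comma, i.e. an empty field
// Order matters: the quoted alternative must be tried before the unquoted one.
var field = /\s*"((?:[^"\\]|\\.)*)"\s*,?|\s*([^,]+),?|\s*,/g;

var record = '"earth, mars",1,"moon, fobos",9.374';
var pieces = record.match(field);
console.log(pieces);
// [ '"earth, mars",', '1,', '"moon, fobos",', '9.374' ]
```

Each piece still carries its trailing comma and its quotes, so a clean-up step is needed afterwards.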
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>CSV Analyzer</title>
    <link href="global.css" rel="stylesheet" type="text/css">
    <script type="text/javascript" src="../../underscore/underscore.js"></script>
    <script type="text/javascript" src="../../jquery/starterkit/jquery.js"></script>
    <script type="text/javascript" src="csv.js"></script>
  </head>
  <body>
    <h1>Comma Separated Value Analyzer</h1>
    <div>
      <i>Write a CSV string. Click the table button. The program outputs a table with the specified data.</i>
    </div>
    <table>
      <tr>
        <th>CSV string:</th> <!-- autofocus attribute is HTML5 -->
        <td><textarea autofocus cols = "80" rows = "5" id="original"></textarea></td>
      </tr>
    </table>
    <button type="button">table:</button><br>
    <span class="output" id="finaltable"></span>
  </body>
</html>
html * {
  font-size: large;
  /* The !important ensures that nothing can override what you've set in this style (unless it is also important). */
  font-family: Arial;
}

h1 { text-align: center; font-size: x-large; }
th, td { vertical-align: top; text-align: right; }
/* #finaltable * { color: white; background-color: black; } */
/* #finaltable table { border-collapse:collapse; } */
/* #finaltable table, td { border:1px solid white; } */
#finaltable:hover td { background-color: blue; }
tr:nth-child(odd) { background-color:#eee; }
tr:nth-child(even) { background-color:#00FF66; }
input { text-align: right; border: none; } /* Align input to the right */
textarea { border: outset; border-color: white; }
table { border: inset; border-color: white; }
table.center { margin-left:auto; margin-right:auto; }
#result { border-color: red; }
tr.error { background-color: red; }
body {
  background-color:#b0c4de; /* blue */
}
// See http://en.wikipedia.org/wiki/Comma-separated_values
"use strict"; // Use ECMAScript 5 strict mode in browsers that support it

$(document).ready(function() {
  $("button").click(function() {
    calculate();
  });
});

function calculate() {
  var original = document.getElementById("original");
  var finaltable = document.getElementById("finaltable");
  var temp = original.value;
  // Placeholder left blank in the original: the regexp that matches one CSV field
  var regexp = /_____________________________________________/g;
  var lines = temp.split(/\n+\s*/);
  var commonLength = NaN; // number of fields in the first valid row
  var r = [];
  // Template using underscore
  var row = "<% _.each(items, function(name) { %>" +
            "  <td><%= name %></td>" +
            "<% }); %>";

  if (window.localStorage) localStorage.original = temp;

  for (var t = 0; t < lines.length; t++) {
    var line = lines[t];
    var m = line.match(regexp);
    var result = [];
    var error = false;

    if (m) {
      // A row is flagged when its field count differs from the first row's
      if (commonLength && (commonLength != m.length)) {
        //alert('ERROR! row <'+line+'> has '+m.length+' items!');
        error = true;
      }
      else {
        commonLength = m.length;
        error = false;
      }
      for (var i = 0; i < m.length; i++) {
        var removecomma = m[i].replace(/,\s*$/, '');
        var remove1stquote = removecomma.replace(/^\s*"/, '');
        var removelastquote = remove1stquote.replace(/"\s*$/, '');
        // Turn every escaped quote \" into a plain quote
        var removeescapedquotes = removelastquote.replace(/\\"/g, '"');
        result.push(removeescapedquotes);
      }
      var tr = error ? '<tr class="error">' : '<tr>';
      // Compile first, then render: works with old and new underscore versions
      r.push(tr + _.template(row)({items : result}) + "</tr>");
    }
    else {
      alert('ERROR! row ' + line + ' does not look like legal CSV');
      error = true;
    }
  }
  r.unshift('<p>\n<table class="center" id="result">');
  r.push('</table>');
  //alert(r.join('\n')); // debug
  finaltable.innerHTML = r.join('\n');
}

window.onload = function() {
  // If the browser supports localStorage and we have some stored data
  if (window.localStorage && localStorage.original) {
    document.getElementById("original").value = localStorage.original;
  }
};
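The per-field clean-up performed inside calculate() can be tried on its own, outside the browser. The sketch below assumes that the last step is meant to turn each escaped quote \" into a plain quote; the function name cleanField and the sample input are hypothetical:

```javascript
// Strip the separator and the surrounding quotes from one matched field.
function cleanField(raw) {
  return raw
    .replace(/,\s*$/, '')   // drop the trailing comma, if any
    .replace(/^\s*"/, '')   // drop the opening double quote, if any
    .replace(/"\s*$/, '')   // drop the closing double quote, if any
    .replace(/\\"/g, '"');  // unescape every \" inside the field
}

// Sample matches, as a field regexp might return them (hypothetical data).
var cleaned = ['"earth, mars",', '1,', '"say \\"hi\\"",', '9.374'].map(cleanField);
console.log(cleaned); // [ 'earth, mars', '1', 'say "hi"', '9.374' ]
```

The order of the steps matters: the trailing comma has to go first so that the closing quote sits at the end of the string when the third replace runs.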
Casiano Rodríguez León