
juan.alcolea at bt
Sep 11, 2001, 4:29 AM
Post #1 of 4
Asking for advice: Using Python for data validation
Hi! I need a piece of advice:

On a small project I am currently working on, we must receive and load data from very different sources and very different geographic locations. The data comes as plain ASCII text files, and we are having a lot of problems with the quality of the data we receive. A lot of time is wasted trying to load badly formatted or incomplete data, finding out where the offending data is and what the problem is, asking the remote administrator to correct and resend the files, etc.

I'm thinking about using Python to code a set of scripts that perform some data validation (format, completeness) on the files *before* they are sent to us, so that any error is detected as close to the source as possible (and as far from us as possible ;-) in order to minimize the time wasted on bad data.

The questions are:

- Do you think that Python is a good choice for this task? Please note that the scripts must run on very different platforms (NT, *nix, maybe Mac...). I'm fairly new to Python, and although I'm impressed with it, I'm not sure whether it is really and easily portable unless you're a C & OS guru...

- Is there any module or library specially designed for this kind of task (parsing text data files with fixed or variable length fields, validating date formats, etc.)?

Big thanks in advance!

Juan Jesús Alcolea Picazo - jjalcolea [at] inad
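To make the kind of check described above concrete, here is a minimal sketch using only the standard library of a current Python 3. Everything specific in it is an assumption for illustration, not something taken from the project: the record layout (`LAYOUT`, `RECORD_LENGTH`), the field names, the `YYYYMMDD` date format, and the helper names `validate_line` and `validate_file` are all hypothetical.

```python
from datetime import datetime

# Hypothetical fixed-width layout: (field name, start offset, end offset).
# The real layout would come from the project's own file specification.
LAYOUT = [
    ("customer_id", 0, 8),
    ("name", 8, 28),
    ("order_date", 28, 36),  # assumed to be YYYYMMDD
]
RECORD_LENGTH = 36


def validate_line(line, lineno):
    """Return a list of error messages for one fixed-width record."""
    record = line.rstrip("\r\n")
    if len(record) != RECORD_LENGTH:
        # Offsets are meaningless on a record of the wrong length, so stop here.
        return [f"line {lineno}: expected {RECORD_LENGTH} characters, got {len(record)}"]

    errors = []
    fields = {name: record[start:end].strip() for name, start, end in LAYOUT}

    # Completeness: every field must contain something.
    for name, value in fields.items():
        if not value:
            errors.append(f"line {lineno}: field '{name}' is empty")

    # Format: the identifier must be numeric.
    customer_id = fields["customer_id"]
    if customer_id and not customer_id.isdigit():
        errors.append(f"line {lineno}: customer_id '{customer_id}' is not numeric")

    # Format: the date must be eight digits and a real calendar date.
    order_date = fields["order_date"]
    if order_date:
        try:
            if len(order_date) != 8 or not order_date.isdigit():
                raise ValueError("not eight digits")
            datetime.strptime(order_date, "%Y%m%d")
        except ValueError:
            errors.append(f"line {lineno}: order_date '{order_date}' is not a valid YYYYMMDD date")

    return errors


def validate_file(path):
    """Validate every line of a text file and return all error messages."""
    errors = []
    with open(path, encoding="ascii", errors="replace") as handle:
        for lineno, line in enumerate(handle, start=1):
            errors.extend(validate_line(line, lineno))
    return errors
```

A sender could run `validate_file("export.txt")` (the file name is a placeholder) before transmitting and send the file only if the returned list is empty. Reporting line numbers with each message is a deliberate choice here, since the stated goal is to locate the offending data quickly.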