[OT] Going "paperless"

- Tim Watts
  
posted
11 years ago

Wed, Jan 9, 2013 1:49 PM

Bit of an "Ask Slashdot" but with so many creative people here :)

Long story short, I want to get organised and dump my more useful paperwork into a computer. Have scanner.

Purpose - easy location of important documents and the ability to shred paper copies of most of them (bar the *really* important ones)

I've done this very manually before with a local filesystem and symlinking the scanned file into various category directories - but it is laborious and error prone.

I've looked at cloud stuff and tried half a dozen linux based Document Management Systems (Logicaldoc, Alfresco, OpenKM and others) and they are all deficient in one way or another, except for the cloud based service "Evernote.com".

I love Evernote.com but there's no way I'm putting my bank details or medical data there unencrypted.

It has *exactly* the featureset (bar encryption) that I think worthwhile:

a) Easy upload

b) Multiple clients - web, android (I might want to pull up my motor insurance on a whim on my phone)

c) Easy tagging (categorisation) with multiple categories per file

d) Easy browsing by category

So I might file the kid's allergy stuff under [Children] and [Medical] and maybe [School]

==========================================

So the question is:

Anyone know of something like evernote that could be run off a linux server under my control

or

How to handle on the fly encryption of selected documents from both Android and a desktop linux client on the way into and out of Evernote? (Without a painful amount of effort that is)

=========================================

If no solutions prevail and if I *had* to implement something, it might look like this:

I would upload well structured file names to my home server, eg

DOCUMENT_NAME-CATEGORY1-CATEGORY_NUMBER_2-#20140120.pdf into a flat folder, then a crontab'd perl script would create symlinks into

./Categories/CATEGORY1/DOCUMENT NAME.pdf ./Categories/CATEGORY NUMBER 2/DOCUMENT NAME.pdf

(ie the hyphen is a tag/entity separator, underscore just means SPACE)

and another perl script would note that the document expires (optional) on 20-01-2014 and archive it then.
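For illustration, the naming scheme and the two cron jobs described above could be sketched in Python rather than Perl. This is only a sketch under assumptions: the function names `parse_name` and `refresh_links`, the directory arguments, and the choice to archive on the expiry date itself are invented here and are not part of the original post.

```python
import re
from datetime import date, datetime
from pathlib import Path


def parse_name(filename):
    """Split 'NAME-CAT1-CAT2-#YYYYMMDD.pdf' into (title, categories, expiry, suffix)."""
    path = Path(filename)
    parts = path.stem.split("-")      # hyphen separates the fields
    expiry = None
    # An optional trailing '#YYYYMMDD' field carries the expiry date.
    if re.fullmatch(r"#\d{8}", parts[-1]):
        expiry = datetime.strptime(parts.pop()[1:], "%Y%m%d").date()
    title = parts[0].replace("_", " ")                  # underscore means SPACE
    categories = [p.replace("_", " ") for p in parts[1:]]
    return title, categories, expiry, path.suffix


def refresh_links(inbox, tree, archive, today=None):
    """Archive expired documents, then (re)build the category symlink tree."""
    inbox, tree, archive = Path(inbox), Path(tree), Path(archive)
    today = today or date.today()
    for doc in sorted(inbox.iterdir()):
        if not doc.is_file():
            continue
        title, categories, expiry, suffix = parse_name(doc.name)
        if expiry is not None and expiry <= today:
            # Assumption: "archive" simply means moving the file out of the inbox.
            archive.mkdir(parents=True, exist_ok=True)
            doc.rename(archive / doc.name)
            continue
        for category in categories:
            link = tree / category / (title + suffix)
            link.parent.mkdir(parents=True, exist_ok=True)
            if not link.is_symlink() and not link.exists():
                link.symlink_to(doc.resolve())
    # Remove links whose target has been archived or deleted.
    if tree.is_dir():
        for link in list(tree.rglob("*")):
            if link.is_symlink() and not link.exists():
                link.unlink()
```

Run from cron, a call such as `refresh_links("inbox", "Categories", "Archive")` would keep the tree in step with the flat folder. Two documents with the same title in the same category would collide, which this sketch does not handle.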

I'd present the category tree read only via webdav or something.

It would work well enough for document retrieval, but it's not very flexible for ad hoc document uploading, eg from a phone.

Not looking for document introspection (eg OCRing scans or full text searches). Don't need "workflow management". Just good solid filing and easy access abilities...

- Jethro_uk
  
posted
11 years ago

Wed, Jan 9, 2013 1:59 PM

On a related topic, how about the ability to create or work with cross-index categories? Some documents can get "lost" because they can't be naturally assigned to a single folder. Especially if two people who do the filing (looks at SWMBO) have different ideas about where a document fits.

- Tim Watts
  
posted
11 years ago

Wed, Jan 9, 2013 3:01 PM

On Wednesday 09 January 2013 13:59 Jethro_uk wrote in uk.d-i-y:

That's where the adding of multiple categories comes in (not that SWMBO would guess all the same as me, but likely she'd come up with enough overlap). It can help to keep fewer categories too rather than getting too complicated :)

Answering my own question, slightly, but not helpfully:

Evernote do not have any client side encryption yet (been asked for since 2011 in their forums). However Nevernote/NixNote is an open implementation of an Evernote client. I suppose someone could make a suite of client side encrypting clients (iPhone, Android and a Java Applet would be the minimum set for starters).

Also found

aes.io

who are producing a similar service with client encryption - but it's in Beta and no mobile clients yet...

Must be a market, surely. Or perhaps not enough people are "smart" about dumping their personal sensitive data onto the cloud....

- Tim Watts
  
posted
11 years ago

Wed, Jan 9, 2013 3:05 PM

On Wednesday 09 January 2013 13:59 Jethro_uk wrote in uk.d-i-y:

Oh and I tried GPG encrypting a file then uploading it. Worked, but the process is far from smooth...
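One way the encrypt-then-upload step might be made less fiddly on the desktop side is a small wrapper around the `gpg` command line. This is a rough sketch, not what was actually done above: it assumes public-key encryption to your own key, and the recipient address and function names are hypothetical. The upload itself is left out.

```python
import subprocess
from pathlib import Path


def gpg_encrypt_command(source, recipient):
    """Build the gpg command line that encrypts `source` to `source`.gpg."""
    source = Path(source)
    target = source.with_name(source.name + ".gpg")
    command = [
        "gpg", "--batch", "--yes",      # no prompts, overwrite old output
        "--encrypt", "--recipient", recipient,
        "--output", str(target),
        str(source),
    ]
    return command, target


def encrypt_for_upload(source, recipient):
    """Run gpg and return the path of the encrypted copy (needs gpg installed)."""
    command, target = gpg_encrypt_command(source, recipient)
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return target
```

For example, `encrypt_for_upload("motor_insurance.pdf", "tim@example.org")` would leave `motor_insurance.pdf.gpg` next to the original, ready to be uploaded. It does nothing for the phone side of the problem.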

- Brian Gaff
  
posted
11 years ago

Wed, Jan 9, 2013 3:44 PM

As a matter of interest, how do people deal with documents that only exist in printed form? I was interested because increasingly companies are saying "we have electronic archives if you want to have a look". However, it's totally pointless. I've looked, and all they seem to have done is scan in the document and leave it as an image, so it's completely blank to a blind person. As a rule I OCR stuff, but of course some paperwork is so flimsy and faint that this does not actually work, or is error prone, as you say. Brian

- Huge
  
posted
11 years ago

Wed, Jan 9, 2013 4:01 PM

That one ... :o(

- The Natural Philosopher
  
posted
11 years ago

Wed, Jan 9, 2013 4:48 PM

digital camera is quicker mainly.

TBH are you able to construct simple PHP and MySQL stuff?

I built a database intended for a set of TIFF drawings. Simple web interface to a LAMP setup (Linux/Apache/MySQL/PHP).

Upload data and use forms menus to 'download' it when you need it.

Takes care of all the 'where to put the files' stuff.

Can be run on a home server or a virtual hosted web server for wider access..

Happy to send you code samples.

That's just making a flat file database. Use MySQL.

One table contains e.g. scanned/photoed images and a description, and a field called 'category' that contains a number. Also things like when uploaded last etc. Whatever.

Category number indicates which category the data belongs to.

So you can search on name, description, when inserted, category, sub category.. (no reason not to have category numbers for CATEGORIES to allow tree structures, if you take care not to make a category a branch of itself, or one of its children.)
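The "don't make a category a branch of itself, or one of its children" rule can be checked by walking up the parent chain before changing a parent. A minimal sketch, using Python's built-in `sqlite3` instead of MySQL so it runs without a server; the table and column names are assumptions, not the schema of the system described above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE categories (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES categories(id)  -- NULL for a top-level category
);
CREATE TABLE documents (
    id          INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    description TEXT,
    uploaded    TEXT DEFAULT CURRENT_TIMESTAMP,
    category_id INTEGER REFERENCES categories(id)
);
"""


def would_create_cycle(db, category_id, new_parent_id):
    """True if putting category_id under new_parent_id would make a loop."""
    current = new_parent_id
    while current is not None:
        if current == category_id:
            return True  # the new parent is the category itself or a descendant
        row = db.execute(
            "SELECT parent_id FROM categories WHERE id = ?", (current,)
        ).fetchone()
        current = row[0] if row else None
    return False


def set_parent(db, category_id, new_parent_id):
    """Move a category under a new parent, refusing moves that would loop."""
    if would_create_cycle(db, category_id, new_parent_id):
        raise ValueError("a category cannot sit under itself or one of its children")
    db.execute(
        "UPDATE categories SET parent_id = ? WHERE id = ?",
        (new_parent_id, category_id),
    )
```

The walk terminates as long as the existing tree is already loop-free, which holds if every parent change goes through `set_parent`.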

In my case I went full BOM so you can have 'sets' of drawings as well. And drawings that can be part of different 'sets'.

But it was for a manufacturer..

Mysql for the grunt work, PHP to hack together access screens.

Then backup the entire database onto a separate disk nightly using e.g. rsync.

Setup here is a two 500GB disk Atom based Linux box, with one disk for live work; the other takes an overnight incremental snapshot - covers disk failure in one of the disks.

System runs mysql/apache/php

Web forms access the data via PHP scripts. Binary data uploaded as is, and stored in fields in tables, cross indexed with categories etc.

New categories possible easily.

retrieval simply throws the data back as a download.
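The store-as-is and throw-it-back-as-a-download idea can be shown in a few lines. This is a sketch under assumptions: it uses Python's built-in `sqlite3` with a BLOB column rather than the MySQL/PHP setup described, and the `files` table, its columns and the function names are made up for illustration. A real web script would send the returned headers and bytes to the browser.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the database holding the uploaded files."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " id INTEGER PRIMARY KEY,"
        " filename TEXT NOT NULL,"
        " mime_type TEXT NOT NULL,"
        " data BLOB NOT NULL)"
    )
    return db


def store_file(db, filename, mime_type, data):
    """Store the uploaded bytes unchanged and return the new row id."""
    cursor = db.execute(
        "INSERT INTO files (filename, mime_type, data) VALUES (?, ?, ?)",
        (filename, mime_type, data),
    )
    db.commit()
    return cursor.lastrowid


def download_response(db, file_id):
    """Return (headers, body) for a download, or None if the id is unknown."""
    row = db.execute(
        "SELECT filename, mime_type, data FROM files WHERE id = ?", (file_id,)
    ).fetchone()
    if row is None:
        return None
    filename, mime_type, data = row
    headers = {
        "Content-Type": mime_type,
        # 'attachment' asks the browser to save the file rather than display it
        "Content-Disposition": f'attachment; filename="{filename}"',
        "Content-Length": str(len(data)),
    }
    return headers, bytes(data)
```

Keeping the files inside the database means one backup covers both the data and the index, at the cost of a larger database file.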

The good thing about DIYing it is that it ends up the way YOU want it.

I did it because I had more time than money, and no solutions existed for what I wanted it to do.

As I said, happy to help design a system for you, if you want to go that route.

In fact it might help clear my desktop, too.


- The Natural Philosopher
  
posted
11 years ago

Wed, Jan 9, 2013 4:49 PM

piece of piss with SQL database.

If you e.g want the CH manual to be under 'CH' and 'manuals'
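Filing one document under several categories is usually done in SQL with a link table between documents and categories. A minimal sketch of that idea, assuming Python's built-in `sqlite3` and invented table names (`documents`, `categories`, `document_categories`); it is not the poster's actual design.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE documents  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- One row per (document, category) pairing; a document may have many.
CREATE TABLE document_categories (
    document_id INTEGER NOT NULL REFERENCES documents(id),
    category_id INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (document_id, category_id)
);
""")


def file_under(db, document, *categories):
    """File a document under any number of categories, creating rows as needed."""
    db.execute("INSERT OR IGNORE INTO documents (name) VALUES (?)", (document,))
    for category in categories:
        db.execute("INSERT OR IGNORE INTO categories (name) VALUES (?)", (category,))
        db.execute(
            "INSERT OR IGNORE INTO document_categories "
            "SELECT d.id, c.id FROM documents d, categories c "
            "WHERE d.name = ? AND c.name = ?",
            (document, category),
        )


def documents_in(db, category):
    """List the names of all documents filed under one category."""
    rows = db.execute(
        "SELECT d.name FROM documents d "
        "JOIN document_categories dc ON dc.document_id = d.id "
        "JOIN categories c ON c.id = dc.category_id "
        "WHERE c.name = ? ORDER BY d.name",
        (category,),
    )
    return [name for (name,) in rows]


file_under(db, "CH manual", "CH", "manuals")
file_under(db, "Boiler service record", "CH")
print(documents_in(db, "CH"))       # both documents
print(documents_in(db, "manuals"))  # only the CH manual
```

The same document then turns up whichever category you browse by, which is also what the cross-index question earlier in the thread was after.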

- Jethro_uk
  
posted
11 years ago

Wed, Jan 9, 2013 5:02 PM

I was angling to see if anyone knew of any software which allowed it ;)

Drifting gently more off topic, there used to be a science to categorization, of which library stocking was a small part. I wonder if that's a skill which is being slowly lost to the past ...

- The Natural Philosopher
  
posted
11 years ago

Wed, Jan 9, 2013 5:09 PM

I'll tell you what, if people are keen to have such systems on their servers, and there are other decoders/testers/debuggers out there.. we should design and build this. It's really NOT that hard.

encryption of the data in transit is catered for using https protocols.

Sod public clouds. Freeware home clouds are better..

- Bernard Peek
  
posted
11 years ago

Wed, Jan 9, 2013 5:59 PM

Encrypting material in a document management system is of dubious value. In order to search the document contents you need to decrypt each document for each search. The alternative bodge is to store the document and its metadata separately but leaving loose metadata around is almost as risky as having unencrypted data (which is not very risky at all.)

- Tim Streater
  
posted
11 years ago

Wed, Jan 9, 2013 6:06 PM

No, use SQLite if it's just for home use with occasional access. That way there's one less server. That's how I've done our home addressbook database. HTML, JavaScript, PHP, and SQLite.

- Bernard Peek
  
posted
11 years ago

Wed, Jan 9, 2013 6:11 PM

Far from it. With the growth of the amount of data being created and stored it is getting more important to find a way of retrieving it once stored. There are still lots of librarians, mainly using Dewey or Library of Congress subject headings. There are lots of specialist coding systems too. It's getting to be a more useful skill as organisations build their own document management systems with bespoke classification structures. Google for ontologies.

If you have lots of documents created using Microsoft Office then Sharepoint is a possibility. It's particularly useful if you use the Word function to populate keywords automatically.

Setting up a Sharepoint server is not a trivial exercise. There is an open-source product with some similar functionality, "O3 Spaces" IIRC.

- Tim Lamb
  
posted
11 years ago

Wed, Jan 9, 2013 6:40 PM

In message , Jethro_uk writes

Exactly. Addresses can be Christian name or surname.....

- Tim Watts
  
posted
11 years ago

Wed, Jan 9, 2013 6:56 PM

On Wednesday 09 January 2013 17:59 Bernard Peek wrote in uk.d-i-y:

It's of great value to me - because I do not want to search the contents. All I want to do is have neat filing where I can find the document because it's well filed (via several possible paths - categories).

Full text searching might be cute but it comes way down below "secure" storage.

It's not even an issue as to whether evernote are trustworthy or not. It's also about hacking[1], US Patriot Act and similar concerns.

I could get my server hacked, but at least it's under my control so I'm fully responsible...

- Tim Watts
  
posted
11 years ago

Wed, Jan 9, 2013 7:00 PM

On Wednesday 09 January 2013 17:09 The Natural Philosopher wrote in uk.d-i-y:

It is for me - I really cannot do GUIs in any shape or form and that includes web guis. I struggle with a simple stateless POST form, let alone anything with state!!

I could do the DB and backend code, but the other problem is right now, no spare time either.

I am surprised something like this doesn't exist - or at least in a form that's any good. I'm trying OpenKM right now - it's another DMS (they are all written in Java running under tomcat for some reason).

If the Webdav interface looks viable on my phone, I might go that route...

Totally agree...

- The Natural Philosopher
  
posted
11 years ago

Wed, Jan 9, 2013 7:50 PM

What res are phones, typically? I think I designed the last system round 1024 x 768 but this should be a lot less screen hungry.

I'll think about this, and stick something to play with up on a public webserver.

If I have time to do it.

- dennis
  
posted
11 years ago

Wed, Jan 9, 2013 7:53 PM

If you can actually get something as good as PaperPort let me know as I just gave up with Linux apps for document management and spent the cash.

- Andy Burns
  
posted
11 years ago

Wed, Jan 9, 2013 8:02 PM

Last year's models are around 1280x720, this year looks like 1920x1080 is the must have size ...

- The Natural Philosopher
  
posted
11 years ago

Wed, Jan 9, 2013 8:46 PM

Holy crap. So much on such a piddly screen?

I'll see what I can do.