
conversiontools.docsorter

Each document to be sorted must have a bracketed id in its file name, e.g. [aac]; here aac is the id of the document.
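To make the naming rule concrete, here is a minimal sketch of pulling the bracketed id out of a file name. `extract_id` is a hypothetical helper written for illustration; it is not part of docsorter, and the module may detect ids differently.

```v
// extract_id returns the text between the first '[' and the next ']'
// in a file name, or none when no bracketed id is present.
// NOTE: hypothetical helper for illustration, not part of docsorter.
fn extract_id(filename string) ?string {
	start := filename.index('[') or { return none }
	rest := filename[start + 1..]
	end := rest.index(']') or { return none }
	id := rest[..end]
	if id == '' {
		return none
	}
	return id
}

fn main() {
	// the file name below is made up for the example
	id := extract_id('kristof_bio [aac].pdf') or { 'no id found' }
	println(id) // prints aac
}
```

A file name without brackets falls through to the `or` branch, so files that do not follow the convention can be skipped or reported.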

How to use

  • documents can be downloaded from any source and put in a directory, which then serves as the source path for sorting

example

#!/usr/bin/env -S v -gc none -no-retry-compilation -cc tcc -d use_openssl -enable-globals run

import os
import freeflowuniverse.crystallib.conversiontools.docsorter

docsorter.sort(
    path: '/Users/despiegk1/Downloads/pdfcleaner'
    export_path: '/tmp/export'
)!



example instructions file:

aaa:ourworld:kristof_bio
aab:phoenix:phoenix_digital_nation_litepaper:Litepaper of how a Digital nation can use the Hero Phone

Each line holds colon-separated fields: the first is the id, the second is the name of the collection, the third is the name of the document, and the fourth is an optional description.
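The line format above can be parsed with a few lines of V. This is a sketch under assumptions: the `Instruction` struct and `parse_instruction` function are hypothetical names, and the module's own parser may behave differently (for example around comments or blank lines).

```v
// Instruction is a hypothetical record for one parsed line;
// the real module may represent this differently.
struct Instruction {
	id          string
	collection  string
	name        string
	description string // optional, empty when omitted
}

// parse_instruction splits one 'id:collection:name[:description]' line.
fn parse_instruction(line string) !Instruction {
	// limit the split to 4 parts so a ':' inside the description is kept
	parts := line.trim_space().split_nth(':', 4)
	if parts.len < 3 {
		return error('expected at least id:collection:name, got "${line}"')
	}
	mut description := ''
	if parts.len == 4 {
		description = parts[3]
	}
	return Instruction{
		id: parts[0]
		collection: parts[1]
		name: parts[2]
		description: description
	}
}

fn main() {
	lines := [
		'aaa:ourworld:kristof_bio',
		'aab:phoenix:phoenix_digital_nation_litepaper:Litepaper of how a Digital nation can use the Hero Phone',
	]
	for line in lines {
		ins := parse_instruction(line) or {
			eprintln(err.msg())
			continue
		}
		println('${ins.id} -> ${ins.collection}/${ins.name}')
	}
}
```

Limiting the split to four parts is a design choice for this sketch: it lets the free-text description contain colons without breaking the earlier fields.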

usage through heroscript

NOT IMPLEMENTED YET


!!docsorter.settings collections_path:'' 

// the following downloads the doc from Google Drive; this only works if the doc is publicly available
!!docsorter.pdf_copy id:'aaa' name:'ourworld_investment_memo' type:'pdf' collection:'ourworld'
    url:'https://docs.google.com/document/d/1sjh2K6iay86H9Gd83gY04bVDSj4brxADEWQMVmDq0SQ'
    description:'OurWorld Investment Memo Nov 2024'
!!docsorter.canva_export ...

fn sort

fn sort(_args Params) !DocSorter

struct Doc

@[heap]
struct Doc {
pub mut:
	id              string
	path            string
	name            string
	description     string
	collection_name string
}

struct DocSorter

struct DocSorter {
pub mut:
	docs []&Doc
	args Params
	py   ?python.PythonEnv @[skip]
}

fn (DocSorter) doc_exists

fn (pc DocSorter) doc_exists(id string) bool
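
The listing gives only the signature. As an illustration of what such a lookup could look like, the sketch below uses reduced stand-ins for `Doc` and `DocSorter` and assumes a linear scan over `docs` that compares ids; the actual implementation is not shown in this document and may differ.

```v
// Minimal stand-ins for the module's types, reduced to the fields used here.
struct Doc {
	id   string
	name string
}

struct DocSorter {
mut:
	docs []&Doc
}

// doc_exists reports whether a document with the given id is registered.
// Assumed behaviour: a linear scan over docs comparing ids.
fn (ds DocSorter) doc_exists(id string) bool {
	for doc in ds.docs {
		if doc.id == id {
			return true
		}
	}
	return false
}

fn main() {
	mut ds := DocSorter{}
	ds.docs << &Doc{
		id: 'aaa'
		name: 'kristof_bio'
	}
	println(ds.doc_exists('aaa')) // true
	println(ds.doc_exists('zzz')) // false
}
```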

struct Params

@[params]
struct Params {
pub mut:
	path         string
	instructions string
	export_path  string
	reset        bool
	slides       bool // if true, extract slides from the PDFs
}