Architecture
The design follows the Clean Architecture methodology.
Data will come either from automated scrapes or from manual input.
Presentation
/sources/index - show all sources (both scrapable and manual)
/sources/{id} - show details of a particular source including all scrapes performed against scrapable sources
/sources/edit/{id} - edit a source
/sources/create - create a new source
/scrapes/index - show all scrapes
/scrapes/{id} - show details of a particular scrape including all records found
/surnames/index - show all surnames with records
/surnames/{value} - show details of a particular surname
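The route list above mixes literal paths (/sources/index, /sources/create) with parameterised ones (/sources/{id}), so the literal routes need to be matched first. The following is only a minimal sketch of that ordering in Python; the view names and the resolve helper are hypothetical and not part of the design.

```python
import re
from typing import Dict, Optional, Tuple

# Route templates copied from the list above; the view names are hypothetical.
# Literal routes are listed before parameterised ones so "/sources/index"
# is never captured as "/sources/{id}".
ROUTES = [
    ("/sources/index", "sources_index"),
    ("/sources/create", "sources_create"),
    ("/sources/edit/{id}", "sources_edit"),
    ("/sources/{id}", "sources_show"),
    ("/scrapes/index", "scrapes_index"),
    ("/scrapes/{id}", "scrapes_show"),
    ("/surnames/index", "surnames_index"),
    ("/surnames/{value}", "surnames_show"),
]


def _compile(template: str) -> "re.Pattern[str]":
    # Split on "{name}" placeholders; odd-indexed parts are placeholder names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for i, part in enumerate(parts):
        pattern += f"(?P<{part}>[^/]+)" if i % 2 else re.escape(part)
    return re.compile(f"^{pattern}$")


COMPILED = [(_compile(template), name) for template, name in ROUTES]


def resolve(path: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (view name, path parameters) for the first matching route."""
    for pattern, name in COMPILED:
        match = pattern.match(path)
        if match:
            return name, match.groupdict()
    return None
```

A real web framework would supply its own router; the point here is only the precedence of literal routes over {id} and {value} routes.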
Application
Commands
IPerformScrapeCommand - scrape the specified site for any records with the specified surname and store
ICalculateChangesFromLastScrapeCommand - compare results from last scrape to second to last scrape (if available) and get changes
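One possible shape for the comparison behind ICalculateChangesFromLastScrapeCommand is sketched below in Python. It assumes items are keyed by their UniqueRecordId and that a change means the stored data differs; the ScrapeChanges type and calculate_changes function are hypothetical names. When no earlier scrape is available, an empty mapping is passed and every record counts as added, which is an assumption rather than a decided behaviour.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ScrapeChanges:
    """UniqueRecordIds grouped by what happened between two scrapes."""
    added: List[str]
    removed: List[str]
    changed: List[str]


def calculate_changes(
    previous: Dict[str, dict], latest: Dict[str, dict]
) -> ScrapeChanges:
    """Compare two scrapes, each given as {UniqueRecordId: record data}."""
    previous_ids = previous.keys()
    latest_ids = latest.keys()
    added = sorted(latest_ids - previous_ids)
    removed = sorted(previous_ids - latest_ids)
    # A record is "changed" when it exists in both scrapes with different data.
    changed = sorted(
        key for key in latest_ids & previous_ids if latest[key] != previous[key]
    )
    return ScrapeChanges(added=added, removed=removed, changed=changed)
```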
Queries
IGetSourcesQuery - get all the different sources e.g. FreeBmdBirths, FreeBmdDeaths, GroBirths, GroMarriages
IGetScrapesQuery - get all scrapes independent of source or surname
IGetScrapesBySourceQuery - get all scrapes performed against a particular source
IGetItemsByScrapeQuery - get all items from a particular scrape
IGetPeopleBySurnameQuery - get all people as merged from the sources for a particular surname
IGet
Domain
Infrastructure
Likely to contain a separate service for each site, responsible for connecting to that site and scraping records from it:
FreeBmdService
GroService
AncestryService
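If the per-site services share a common interface, the application layer can run a scrape without knowing which site it is talking to. The sketch below illustrates that idea in Python; the ScrapeService protocol, its scrape method and the FakeFreeBmdService stand-in are all hypothetical, and no real site is contacted.

```python
from typing import Dict, Iterable, List, Protocol


class ScrapeService(Protocol):
    """Hypothetical contract each site-specific service would satisfy."""

    source_name: str

    def scrape(self, surname: str) -> Iterable[Dict[str, str]]:
        ...


class FakeFreeBmdService:
    """Stand-in that filters canned records instead of contacting a site."""

    source_name = "FreeBmdBirths"

    def __init__(self, records: List[Dict[str, str]]) -> None:
        self._records = records

    def scrape(self, surname: str) -> Iterable[Dict[str, str]]:
        wanted = surname.lower()
        return [r for r in self._records if r["Surname"].lower() == wanted]


def perform_scrape(service: ScrapeService, surname: str) -> List[Dict[str, str]]:
    """Run one scrape through whichever service is supplied."""
    return list(service.scrape(surname))
```

A GroService or AncestryService would then only need to provide the same scrape method to be usable by the same command.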
Persistence
Items
Locations
ManualEntryItems
  Id - generated unique id (mandatory)
  SourceId - foreign key to Sources table (mandatory)
  UniqueRecordId - a unique identifier as recognised by this particular source (mandatory)
  Surname - the surname of the record (mandatory)
  Forenames - the forenames of the record (optional)
  FactOrEvent - either Fact OR Event (optional)
  FactOrEventType - e.g. Birth, Death, Census (optional)
  FactOrEventDate
  FactOrEventLocation
  Data - full data from the source, key value pair
ScrapeItems
  Id - generated unique id (mandatory)
  ScrapeId - foreign key to Scrapes table (mandatory)
  UniqueRecordId - a unique identifier as recognised by this particular source (mandatory)
  Surname - the surname of the record (mandatory)
  Forenames - the forenames of the record (optional)
  FactOrEvent - either Fact OR Event (optional)
  FactOrEventType - e.g. Birth, Death, Census (optional)
  FactOrEventDate - date string (optional)
  FactOrEventLocation - location string (optional)
  Data - full data from the source, key value pair JSON
Scrapes
  Id - generated unique id (mandatory)
  SourceId - foreign key to Sources table (mandatory)
  Surname - the surname used to perform the scrape (mandatory)
  StartedAt - datetime the scrape was started
  FinishedAt - datetime the scrape finished
  TimeTaken - calculated
  NumberOfRecordsScraped - calculated
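TimeTaken and NumberOfRecordsScraped are marked as calculated, so they could be derived rather than stored. The Python sketch below shows one way to do that, assuming TimeTaken is FinishedAt minus StartedAt and NumberOfRecordsScraped is the count of items attached to the scrape; the Scrape class and its attribute names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Scrape:
    id: int
    source_id: int
    surname: str
    started_at: datetime
    finished_at: Optional[datetime] = None
    items: List[dict] = field(default_factory=list)

    @property
    def time_taken(self) -> Optional[timedelta]:
        # Undefined until the scrape has finished.
        if self.finished_at is None:
            return None
        return self.finished_at - self.started_at

    @property
    def number_of_records_scraped(self) -> int:
        return len(self.items)
```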
Sources
  Id - generated unique id (mandatory)
  Name - short name of the source e.g. FreeBmdBirths, InterviewWithSylvia
  Description - description of the source including base URL if relevant
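As a rough illustration of how the Sources, Scrapes and ScrapeItems tables relate, the Python sketch below creates them in SQLite using the standard sqlite3 module. The choice of SQLite, the column types, and the create_database and add_item helpers are assumptions made for the example; the Data column is stored as JSON text, and the calculated columns are left out because they can be derived.

```python
import json
import sqlite3

# Column names follow the tables above; the SQLite types are assumptions.
SCHEMA = """
CREATE TABLE Sources (
    Id INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    Description TEXT
);
CREATE TABLE Scrapes (
    Id INTEGER PRIMARY KEY,
    SourceId INTEGER NOT NULL REFERENCES Sources(Id),
    Surname TEXT NOT NULL,
    StartedAt TEXT,
    FinishedAt TEXT
);
CREATE TABLE ScrapeItems (
    Id INTEGER PRIMARY KEY,
    ScrapeId INTEGER NOT NULL REFERENCES Scrapes(Id),
    UniqueRecordId TEXT NOT NULL,
    Surname TEXT NOT NULL,
    Forenames TEXT,
    FactOrEvent TEXT,
    FactOrEventType TEXT,
    FactOrEventDate TEXT,
    FactOrEventLocation TEXT,
    Data TEXT
);
"""


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_item(conn, scrape_id, unique_record_id, surname, data) -> int:
    """Insert one scraped record, serialising the key value data as JSON."""
    cursor = conn.execute(
        "INSERT INTO ScrapeItems (ScrapeId, UniqueRecordId, Surname, Data) "
        "VALUES (?, ?, ?, ?)",
        (scrape_id, unique_record_id, surname, json.dumps(data)),
    )
    return cursor.lastrowid
```

With this layout, NumberOfRecordsScraped for a scrape is a COUNT over ScrapeItems filtered by ScrapeId.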