I have been trying to figure out the best way to play and manage film scans, and learned a bag of new tricks from reto.ch!
Playing a DPX sequence
ffplay [DIRECTORY]/Scan_%06d.tif
The pattern %06d is a printf-style placeholder rather than a regular expression: it matches six-digit numbers, possibly with leading zeroes. This allows the full sequence inside one folder to be read in ascending order, one image after the other. Of course, the command must match the naming convention actually used.
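To see what the %06d placeholder expands to, here is a small Python sketch. Python's % operator follows the same printf rules, and the Scan_ prefix is simply the naming used in the example above.

```python
# Naming pattern borrowed from the ffplay example above.
pattern = "Scan_%06d.tif"

# %06d means: an integer, zero-padded to a width of six digits.
names = [pattern % n for n in (0, 1, 42, 123456)]

for name in names:
    print(name)
```

If your scans use a different prefix or a different number of digits, the pattern has to change accordingly (for instance %04d for four digits).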
-f image2 forces the image file demuxer for single image files
-framerate sets the frame rate to 24
NOTE: The previous two parameters must come before the input file, because they are applied to the input file.
-i path, name and extension of the input file
The pattern %06d matches six-digit numbers, possibly with leading zeroes. This allows the full sequence inside one folder to be read in ascending order, one image after the other. The command must of course match the naming convention actually used.
-c:v chooses the ProRes video codec
-profile:v the ProRes 422 HQ flavour has the video profile 3
-filter:v filters the video stream:
  - scaling to the correct size (we use the Lanczos scaling algorithm, which is slower but better than the default algorithm)
  - padding the 4:3 format into the 16:9 HD format with pillar boxes
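The options described above can be assembled into a full command line. This is only a sketch under my own assumptions: the prores_ks encoder name, the 1440x1080 image size inside a 1920x1080 frame, and the file names are not taken from the notes above, so adjust them to your material.

```python
import shlex


def prores_command(pattern, output, fps=24):
    """Build an ffmpeg argument list for a ProRes 422 HQ file from an image sequence."""
    # Assumed sizes: a 4:3 image scaled to 1440x1080, centred in a 1920x1080 frame.
    pad_x = (1920 - 1440) // 2  # width of each pillar box
    video_filter = f"scale=1440:1080:flags=lanczos,pad=1920:1080:{pad_x}:0"
    return [
        "ffmpeg",
        "-f", "image2",            # input options come before -i
        "-framerate", str(fps),
        "-i", pattern,
        "-c:v", "prores_ks",       # assumption: one of ffmpeg's ProRes encoders
        "-profile:v", "3",         # 3 = ProRes 422 HQ
        "-filter:v", video_filter,
        output,
    ]


# Hypothetical file names, for illustration only.
cmd = prores_command("Scan_%06d.tif", "master_prores.mov")
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal, or the list can be handed to subprocess.run(cmd, check=True) once ffmpeg is installed.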
Making an H.264 access file directly from the TIFF conservation files
-c:v chooses the H.264 codec by using the libx264 library
-preset chooses the very slow preset (veryslow), which gives the best result
-qp a quantisation parameter of 18 means “visually lossless”
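Put together, the H.264 options above might look like the following sketch. The input options are assumed to be the same as for the ProRes file, and the file names are hypothetical.

```python
import shlex


def h264_command(pattern, output, fps=24):
    """Build an ffmpeg argument list for an H.264 access file from an image sequence."""
    return [
        "ffmpeg",
        "-f", "image2",          # assumed: same input options as before
        "-framerate", str(fps),
        "-i", pattern,
        "-c:v", "libx264",
        "-preset", "veryslow",   # slowest encode, best compression for a given quality
        "-qp", "18",             # constant quantisation parameter
        output,
    ]


# Hypothetical file names, for illustration only.
access_cmd = h264_command("Scan_%06d.tif", "access_h264.mp4")
print(shlex.join(access_cmd))
```

Some players only handle 4:2:0 H.264, so adding -pix_fmt yuv420p is a common extra step, although it is not part of the notes above.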
Managing digital data has become increasingly challenging and can be a daunting experience given the sheer volume that can build up in a short period. Digital data is vulnerable and can easily be lost if not properly cared for. Hard drives fail, files get plagued by bit-rot, or they get deleted by accident. Data loss is painful, and recovery (if even possible) is an expensive and stressful process. With the advancement of born-digital filmmaking technologies, we are generating more data than ever before, and this greatly increases the risk of losing our films if we do not start taking steps to care for the physical longevity of a work as it is being created.
In the film archive, many of the titles we receive in born-digital formats require hours of staff time to sift through and sort out the clutter. This is usually because the data is not properly organised or lacks sufficient documentation (for instance, poor file-naming conventions), making it difficult to know what is being filed.
Since many people are working from home as a result of COVID-19, it is a great opportunity to take stock and review the backup of your files and check if you are still able to access your digital films and related materials. No one else knows your work better than you. So ensuring the integrity of your materials is a task best done by you.
We have compiled some practical tips and basic data management concepts that are easy to follow for anyone planning to organise and back up their personal files at home. There is no single right way to do this, and you may find other solutions that better suit your needs.
For the long-term preservation of your film works, consider sending it to the Asian Film Archive. We will assess if it falls within our acquisition policy. Click here to provide us with some information.
Let’s start!
Consolidate your files
Tracking and managing your files gets challenging when they are kept on multiple, different devices. It’s easy to lose track of their location if some are sitting on hard drives and others are on online platforms such as Dropbox, Google Drive and OneDrive. Identifying where the files are is the first step in taking stock of what you have.
Once you have located your files, you could centralise them on your computer or an external hard drive. Remember to select a suitably sized storage medium for this.
At this stage, everything will look like a huge mess. Take the time to survey what you have, because this will inform how you organise and name the files later.
By the end of this process, you will have a good sense of how much data you have, and knowing this will help you allocate the necessary storage for your backup strategy.
Keep only what you need
Select what you need and delete whatever is unimportant. For example, you might encounter multiple copies of a file, and it is worth considering what is sufficient to keep, for instance only the latest version.
This quality check process will free space, keeping your data volume to a minimum. Working with a smaller volume will keep costs low and allow an easier migration and backup process down the road.
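Spotting exact duplicates by eye is tedious, so a checksum can help. The sketch below is one possible approach and not a prescribed tool: it groups files that have identical content using SHA-256. It only finds byte-for-byte copies, not different versions of the same work, and it deletes nothing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files do not fill the memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling find_duplicates on your consolidated folder returns the groups of identical files, and you can then decide by hand which copy to keep.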
Organise your files
Organise the files that you have selected by creating a file directory structure and a file naming convention that makes sense to you and others accessing the files. Whatever system you decide upon should be easily understood by you and others to ensure easy accessibility and quick identification.
File Directory Structure
This is an example of how files can be organised and structured:
I find it useful to first determine the top-level folder. I used [YEAR] in this example, and branched into sub-directories by [CATEGORY]. In the screenshot above, we see Year 2020, and the categories are AFA Work, External Projects and Personal Documents. The lower-level directories might be project- or event-based, but try to keep them as consistent as possible.
There is no ‘perfect’ format, and this example gives you an idea of how you can start. Your directory should be intuitive and logical, and should be guided by how you work.
Here are some tips on designing your own directory.
Draw it out: Draw your directory out on a piece of paper before implementing it.
Keep it simple: Avoid complex and deeply nested designs.
Be consistent: Keep a consistent structure across folders.
Be precise: Use plain language and keep names short.
Avoid spaces, punctuation and symbols: Some operating systems and tools handle spaces in names poorly, so avoid them in a mixed operating system environment.
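As a rough illustration of the last tip, the sketch below turns free-text labels into folder names without spaces or symbols, following the [YEAR]/[CATEGORY] layout from the example. The project name is hypothetical, and replacing unwanted characters with underscores is just one possible convention.

```python
import re


def safe_name(text):
    """Replace runs of spaces, punctuation and symbols with a single underscore."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", text.strip())
    return cleaned.strip("_")


def project_path(year, category, project):
    """Join the cleaned parts into a [YEAR]/[CATEGORY]/[PROJECT] path."""
    return "/".join([str(year), safe_name(category), safe_name(project)])


# "Data Management Post" is a made-up project name.
example = project_path(2020, "AFA Work", "Data Management Post")
print(example)
```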
File Naming Convention
Descriptive file names are essential for quick identification and retrieval of files. Poor naming conventions are frustrating and waste a lot of time, since they do not give useful information (“Best practices for file naming”, 2020).
A guiding principle for filenames is to include basic information such as object type, dates, and important remarks. These indicators can be crucial in distinguishing one file from another in the event that there are multiple variations of a given file. In essence, an effective file name should tell you what the file is without you having to open it (Antin, 2020).
In the screenshot above, the assets of a project are organised by their folders: Logos, Mock_Up_Thumbnail and Watermarked_Stills. The names of the three files clearly indicate that they are three image stills (Still) from the film Sunshine Singapore (SS), watermarked (WM), in .png format.
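A naming convention is easier to keep consistent when the names are generated rather than typed. The sketch below builds names from the same pieces mentioned above (film code, object type, remark). The order of the parts and the two-digit counter are my assumptions, since the exact names in the screenshot are not reproduced here.

```python
def build_filename(film_code, object_type, remark, index, extension):
    """Join the descriptive parts with underscores and add a zero-padded counter."""
    parts = [film_code, object_type, remark, f"{index:02d}"]
    stem = "_".join(part for part in parts if part)
    return stem + "." + extension.lstrip(".")


# Hypothetical names for three watermarked stills.
still_names = [build_filename("SS", "Still", "WM", i, "png") for i in (1, 2, 3)]
print(still_names)
```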
Now that your files are all in place and organised, it is time to back them up! You can consider the classic 3-2-1 data protection strategy, a model widely adopted by professionals in content and media production.
The 3-2-1 strategy
Keep 3 copies of your data
The 3-2-1 strategy encourages keeping three copies of your data because one copy is simply not enough. Having a single copy is dangerous, and the more copies you have, the lower the chance of complete data loss.
Store 2 copies on 2 different storage devices
Drives will eventually fail because of mechanical failure or wear and tear. Hence the 3-2-1 strategy recommends keeping your first two backups on two separate storage devices at your primary location. Storing the two copies differently will provide an added insurance for data restoration in the event that one source fails.
There are various backup storage solutions available, but which do you choose? Here are two solutions you can consider.
A) External hard drive (SATA & SSD)
This is the most common solution because it is affordable, easy to use and widely available. An external hard drive requires little set-up: just plug it into your computer via USB and it is ready to use. They usually come in two variants, SATA and SSD, and their differences are explained in the links provided. SATA drives are much cheaper, but SSDs are faster and less prone to mechanical failure since they have no moving parts. However, it is important to note that SSDs have limited write cycles, even though they are less susceptible to physical wear (“SATA vs SSD vs NVMe: Types of Hard Drives”, 2020).
Pros: Easy to use, portable, affordable, widely available
Cons: Can’t easily share files
You can consider deploying multiple hard drives and rely on tools/software to help duplicate data from one drive to another.
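If you prefer to script the duplication yourself, a copy should ideally be verified after it is written. The following is a minimal sketch of that idea, not a replacement for dedicated backup software: it copies one file and compares SHA-256 checksums of the source and the copy.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not fill the memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy source to destination and raise an error if the checksums differ."""
    destination = Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also preserves timestamps
    if sha256_of(source) != sha256_of(destination):
        raise OSError(f"Checksum mismatch after copying {source}")
    return destination
```

Running the function over every file on the first drive, with the destination on the second drive, gives you a verified duplicate.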
B) Network Attached Storage (NAS)
A NAS storage device is connected over a computer network and acts as a central location for multiple users to write and access data (“What is NAS (Network Attached Storage) and Why is NAS Important for Small Businesses? Seagate UK”, 2020). Depending on the model, multiple hard drives are housed within a NAS, and its capacity can be scaled up depending on the number of available bays. You can think of a NAS as an array of hard drives put together to form a larger storage unit.
It can be set up to use a RAID configuration to ‘create’ multiple units of storage within the NAS but still behaving as one cohesive storage. This allows you to manage storage redundancy and performance according to your needs. There are many NAS solutions available but Synology and QNAP are two popular brands.
Pros: Allows multiple users to access, scalable, allows customization, status monitoring
Cons: High upfront cost, requires basic technical knowledge
Keep 1 copy off-site
The last component of the 3-2-1 strategy is to have 1 copy of your data stored off-site, away from your primary location. This is a key component in designing a robust backup strategy, as on-site file storage can be compromised by hardware failure, theft or a fire. You can consider cloud storage as your off-site solution, where your files are hosted on a cloud service for a fee (“Backup Strategies: Why the 3-2-1 Backup Strategy is the Best”, 2020).
There are many cloud storage plans you can consider, some costing as little as USD6 per month for unlimited file storage. Most of these solutions keep data restoration straightforward as well. For example, they can restore your data in different ways: direct download, USB flash drive or external hard drive. The method you pick will depend on the volume of data you are retrieving.
Conclusion
There is no one way to manage and back up your data, since it will depend on your data volume and, importantly, your budget. When it comes to designing your backup system, a general rule of thumb is to have multiple copies and spread them across different storage solutions. If one source fails, you can rely on the others for data restoration.
Managing your data requires continuous effort, and it is easy to overlook it with competing responsibilities. Hence you can consider a regular backup routine or simply back up as you go. By doing so, you not only reduce the risk of data loss but also have peace of mind that your files are safe and readily accessible when needed.
Reference List
Antin, K. (2020). File naming conventions: why you want them and how to create them. HURIDOCS. Retrieved 11 October 2020, from https://www.huridocs.org/2016/07/file-naming-conventions-why-you-want-them-and-how-to-create-them/.
Backup Strategies: Why the 3-2-1 Backup Strategy is the Best. Backblaze Blog Cloud Storage & Cloud Backup. (2020). Retrieved 11 October 2020, from https://www.backblaze.com/blog/the-3-2-1-backup-strategy/.
Best practices for file naming. Stanford Libraries. (2020). Retrieved 10 October 2020, from https://library.stanford.edu/research/data-management-services/data-best-practices/best-practices-file-naming.
SATA vs SSD vs NVMe: Types of Hard Drives. Pluralsight.com. (2020). Retrieved 11 October 2020, from https://www.pluralsight.com/blog/it-ops/types-of-hard-drives-sata-ssd-nvme.
What is NAS (Network Attached Storage) and Why is NAS Important for Small Businesses? Seagate UK. Seagate.com. (2020). Retrieved 10 October 2020, from https://www.seagate.com/sg/en/tech-insights/what-is-nas-master-ti/.
I recently had to provide a list of files in a USB thumbdrive/stick and learned a sweet trick via CLI. This method provides a text, Word or Excel file that lists all the files and folders inside a specific directory within your computer.
dir lists all the files and folders contained in the folder
/s will list all the files in the subfolders as well
Output.doc is the document file that receives the entire directory listing and its details. The redirected output is plain text whatever the extension, so it can equally be a simple .txt file, which can be edited in Notepad.
Listing only certain types of files
The command will be:
dir /s *.pdf > output_pdf.doc
*.pdf is a wildcard pattern that selects only .pdf files.
List bare format (no heading, sizes or summary)
The switch /b will list file names, however when displaying subfolders with dir /b /s, the command will return a full pathname.
dir /b /s > output_pdf.doc
Using the tree command instead of dir
This command will produce a tree listing of the current directory
tree /f > output.txt
/f displays the names of the files within each directory listed.
/a may be used to specify alternative (ascii) characters to be used to draw the tree diagram so that it can be printed by printers that do not support the line and box drawing characters.
In a YUV data structure scheme, the ‘Y’ represents the luma value, and the ‘U’ and ‘V’ represent two chroma values. In RGB, by contrast, the values represent the intensities of the red, green and blue channels in the pixel.
In common 8-bit video, each ‘Y’, ‘U’ and ‘V’ value comprises 8 bits (one byte) of data.
Y value = Luminance value
Overall brightness of the pixel. It is a grayscale value.
U (Cb) value = Chrominance value
Specifically the blue-difference component (blue minus luma)
V (Cr) value = Chrominance value
Specifically the red-difference component (red minus luma)
The U and V values are coordinates rather than brightness values, so they can be positive or negative.
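To make the three values concrete, here is a small conversion sketch. The coefficients are the full-range BT.601 ones used by JPEG, which is my assumption since no particular standard is named above. The chroma values are left signed, centred on zero, to show that they behave like coordinates.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to luma plus signed chroma (assumed BT.601 full range)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b   # blue-difference
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b    # red-difference
    # 8-bit files usually store cb + 128 and cr + 128 so the values fit in 0..255.
    return y, cb, cr


samples = [("mid grey", (128, 128, 128)), ("pure red", (255, 0, 0)), ("pure blue", (0, 0, 255))]
for name, rgb in samples:
    y, cb, cr = rgb_to_ycbcr(*rgb)
    print(f"{name}: Y={y:.1f} Cb={cb:.1f} Cr={cr:.1f}")
```

A grey pixel ends up with both chroma values at zero, pure red has a positive Cr and a negative Cb, and pure blue is the other way round.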
Why not stick to RGB?
Colour and brightness information are combined within the three channels in RGB, i.e. increasing the R channel will increase both the colour and the brightness value in tandem. In simpler terms, both properties are combined in the same value.
In a YUV system, the brightness information is completely separated from the colour information. In other words, more control is afforded.
Practical application
Backward compatibility - Black and white TVs can’t take RGB signals, since the brightness value is baked into the colour values, which they do not understand. YUV, on the other hand, has the Y value, which the television set can process while ignoring the colour components.
Chroma sub-sampling - YUV allows the user to remove information specifically from the colour values without affecting the overall luminance of the image. This is particularly useful during image compression and is an effective way of processing images.
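As a toy illustration of chroma sub-sampling, the sketch below halves the resolution of one chroma plane in both directions, in the spirit of 4:2:0, by averaging each 2x2 block. Real encoders may filter and position the samples differently, and the plane values here are made up.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane given as a list of equal-length rows.

    Assumes the width and height are even.
    """
    height, width = len(plane), len(plane[0])
    result = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            total = plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]
            row.append(total / 4)
        result.append(row)
    return result


# A made-up 4x4 chroma plane; the luma plane would be left untouched.
cb_plane = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [0, 4, 20, 20],
    [8, 4, 20, 20],
]
small = subsample_420(cb_plane)
print(small)
```

For a 4x4 image this keeps 16 luma samples but only 4 samples for each chroma plane, so 24 values are stored instead of 48.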
This post will be part of a series documenting the network setup of my new apartment. The apartment is over 20 years old hence a major overhaul is required and the renovation is a great opportunity to lay CAT6A in the house for the 1Gbps/10Gbps network 🤓.
My research began with this article on patnotebook.com detailing the conversion of telephone points to RJ45 data points in newer BTO (Build to Order) flats in Singapore. Newer flats in Singapore have CAT6 cables already nicely laid in the house, however many of the points are terminated with a telephone jack instead of RJ45. Based on the article, it seems that BTO flats only have one point terminated as RJ45, which is definitely not enough 😁.
Older resale flats like mine do not come with any data points, hence I will have to lay my own cables and design my own network infrastructure from scratch. On top of patnotebook.com, I’ve relied on many YouTube videos and blogs to learn basic networking concepts and the different components required to run a network.
I’ve devoured Evan McCann’s tech blog since I plan to purchase Ubiquiti hardware for the network. There are many nice articles that explain networking jargon in simple terms and break down the Ubiquiti universe into manageable chunks. He is also extremely detailed in his breakdowns and comparisons of the many Ubiquiti products. I would recommend his site as essential reading for anyone considering putting together a Ubiquiti system!
I plan to update my progress with subsequent posts, but for now I’ve made a simple diagram detailing the essential hardware required for the network setup.