What are we doing here?

This blog includes a series of videos and references to help new users or enthusiasts better understand how to use open source and free technology tools. The quick links include more information for accessing many of the tools covered, along with other references for learning how to take advantage of them.

Click HERE to see the full list of topics covered!

Nextcloud and Ansible

 

This tutorial is really more of a proof of concept of using Ansible and other scripts to more easily deploy Nextcloud from scratch. 

Ansible is an interesting tool for deploying and maintaining server systems. Much like Docker Compose, it uses YAML for its configuration files, called Playbooks, and these Playbooks can be used to automate actions on the managed nodes in the cluster.

In the deployment, as mentioned in the video, I do not deploy everything that's needed for Nextcloud. The Ansible Playbook doesn't even download and install Nextcloud. This is deliberate because I would think that nodes can be added, taken away, replaced, etc. and the Nextcloud config may want to reside in a more permanent storage environment. For this reason I offer a reference for getting NFS set up in Debian 12, and then mounting it to the web server. 
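As a rough sketch of the mounting step, the snippet below formats an /etc/fstab entry for an NFS export. The server address, export path, and mount point are made-up placeholders rather than values from the video, and the options shown (defaults,_netdev) are just one common choice.

```python
def nfs_fstab_entry(server, export, mount_point, options="defaults,_netdev"):
    """Return one /etc/fstab line for an NFS share.

    fstab fields: device, mount point, type, options, dump, fsck order.
    """
    return f"{server}:{export} {mount_point} nfs {options} 0 0"


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(nfs_fstab_entry("192.168.1.50", "/srv/nextcloud", "/var/www/nextcloud"))
```

With a line like this in place on each web server node, the Nextcloud files live on the NFS host and survive a node being replaced.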

After the initial web server and Nextcloud install are complete, and the config.php file is modified as needed for Redis, Memcache, etc., subsequent updates and changes to web server nodes can be made. I even noticed, while recording, that I had missed the memcached package in the YAML, but a quick change and re-running the configuration brought it in.
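The reason a re-run can pick up a forgotten package is that the playbook describes a desired state instead of a list of one-off commands. The toy sketch below shows the idea in Python; it is not how Ansible is implemented, and the package lists are invented for illustration.

```python
def missing_packages(desired, installed):
    """Return the desired packages not yet installed, in their listed order."""
    installed = set(installed)
    return [pkg for pkg in desired if pkg not in installed]


if __name__ == "__main__":
    # Hypothetical package lists; the real ones live in the playbook on GitHub.
    node_has = ["apache2", "php", "redis-server"]
    playbook_wants = ["apache2", "php", "redis-server", "memcached"]
    print(missing_packages(playbook_wants, node_has))
```

Only the difference gets acted on, so running the same configuration again after adding one package changes nothing else on the node.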

The source for the Ansible playbook, as well as single-node scripts for installing Nextcloud from scratch, is available on GitHub.

https://github.com/JoeMrCoffee/AutomateManualNextcloudInstall/

Printers and Linux

For years I've been using public printing options to print anything that needs printing. Most all of my data can be digital, and I've become rather fastidious at managing and maintaining digital data. Living in Taiwan, paper gathers dust, gets dirty, yellows, and can easily become moldy. However, there are times that you just need to print something.

In the past (up until yesterday), if I needed to print something I would simply copy the file onto my USB drive, plug that into whatever kiosk or computer the convenience store or print shop had, and print. For years I read the waiver saying something to the effect of "we may keep an image of what is being printed....." for whatever reason, tapped "Ok", and paid the few cents per page printing fee. I've never liked doing it, but I print so rarely that I thought it was the best middle ground.

However, with all this AI mumbo jumbo becoming ubiquitous, I think the margin for error on the parts of these "tech companies" is just growing. The latest announcement from Microsoft introduces "Windows Recall", and that was the straw that broke my camel's back. The layers of fat that had put up with so much over the years, could protect the poor camel NO MORE. This is a ridiculous, unneeded, and potentially dangerous feature that no one is asking for, aside maybe from investors asking Microsoft why they need to spend billions of dollars on the letters 'A' and 'I'. 

Now, having been a Linux user almost entirely for 6 years, I'm not too worried about this Windows feature. However, where would I expose my data on a Windows system? When I print something, because everyone uses Windows at the Kiosk, at the print shop, etc. Microsoft claims the data is "private and on device", but not if it isn't my device - if a Windows PC is anyone's personal device in the first place. 

More info: https://arstechnica.com/gadgets/2024/05/microsofts-new-recall-feature-will-record-everything-you-do-on-your-pc/

So, with all that said and ranted about, I decided it was time to purchase an actual printer. After exploring some options, I settled on the HP M141W which is an entry-level black-and-white laserjet with a scanner. It has network access, but I decided to just connect via USB. The tinfoil hat remains in place.

The packaging listed all the steps, apps, drivers, etc. for every platform and claimed to require Internet access, but on Linux the printer definitely works and does not need to connect to the Internet. Simply ensure the packages 'hp-ppd' and 'hpijs-ppds' are installed (at least that is what I tested on PopOS / Ubuntu / Debian). The scanner worked immediately, while the printer needed to be power cycled after the driver package installed.
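For anyone who wants to check for those two packages before plugging the printer in, here is a small sketch that asks dpkg-query for their status. It assumes a Debian-family system, and the helper names are my own.

```python
import subprocess

PACKAGES = ["hp-ppd", "hpijs-ppds"]  # package names from the post


def status_means_installed(status):
    """dpkg reports 'install ok installed' for a fully installed package."""
    return status.strip() == "install ok installed"


def is_installed(package):
    """Ask dpkg-query for a package's status (Debian-family systems only)."""
    try:
        result = subprocess.run(
            ["dpkg-query", "-W", "-f=${Status}", package],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        # dpkg-query is not available, so this is not a Debian-family system.
        return False
    return result.returncode == 0 and status_means_installed(result.stdout)


if __name__ == "__main__":
    for name in PACKAGES:
        print(name, "installed" if is_installed(name) else "missing")
```

Anything reported as missing can be installed with apt as usual.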

Based on my needs, I am hopeful this will last a good while. Printer support in Linux is known these days to be quite robust. Thanks to the CUPS driver stack, most printers will be supported, but always check. I made sure that HP printers, as a policy, have Linux driver support, and just wanted to share the steps to get one up and running.


File Storage and Sharing for Creators in 2024

This article explores the many ways content creators, photographers, videographers, and production companies can leverage open source tools to store, share, and collaborate on their media assets. The right method depends on the size of the team - from a single individual to a multi-national team - so the article examines the pluses and minuses of each approach.

A single creator

Creators making and storing large files have a plethora of options available. The simplest is just to copy files onto one's workstation or laptop, edit, and publish. This quickly becomes a problem, particularly for individuals working with 4K or even 8K content. For these editors, the usual next step is to use one or more external hard drives. This approach is perfectly viable, but can become an issue as users fill up more and more hard drives.

Another issue related to scale is performance. A direct-attached drive over USB or USB-C can in theory handle upwards of 10 Gb/s, but larger drives that are still traditional spinning rust (a normal hard drive that spins) will have a max throughput of around 250 MB/s, which is about 2 Gb/s, or roughly a fifth of that interface speed. NVMe SSDs are available and becoming ever more cost competitive at the 500 GB, 1 TB and 2 TB sizes, but will be noticeably more expensive than traditional hard drives at higher capacities. However, even though external SSDs are performant, they are not redundant, so manual backups will be needed, and the SSDs will eventually wear out. Always make sure data is backed up in some form. A collection of external drives also becomes unwieldy over time, since the storage per drive cannot be expanded. Content creators often end up with a pile of carefully labeled drives, with projects potentially spread across several of them. Essentially sprawl.
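To make the gap between interface speed and drive speed concrete, here is a back-of-the-envelope calculation. It assumes decimal units and treats the hard drive figure as 250 megabytes per second of sustained throughput; real drives will vary.

```python
def gbps_to_mbytes_per_s(gigabits_per_second):
    """Convert gigabits/s to megabytes/s (decimal units, 8 bits per byte)."""
    return gigabits_per_second * 1000 / 8


def copy_minutes(size_gb, mbytes_per_s):
    """Minutes needed to copy size_gb gigabytes at a sustained rate."""
    return size_gb * 1000 / mbytes_per_s / 60


if __name__ == "__main__":
    interface = gbps_to_mbytes_per_s(10)  # 1250.0 MB/s
    hdd = 250                             # MB/s, assumed figure for a hard drive
    print(f"HDD uses {hdd / interface:.0%} of the interface")
    print(f"1 TB copy at HDD speed: {copy_minutes(1000, hdd):.0f} minutes")
```

The drive, not the cable, is the bottleneck, which is why a faster port alone does not speed up a spinning disk.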

The introduction of 8K footage is another major issue for production houses and creators. 8K footage is truly massive, generating upwards of 120 GB of content per minute.* An external hard drive or SSD will quickly fill up, potentially within a single shoot. Creators need more storage, and in a different format, to keep up.
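A quick calculation shows how fast that adds up. The sketch below takes the 120 GB per minute figure from the reference as given and assumes decimal terabytes; the drive sizes are just examples.

```python
def minutes_of_footage(drive_tb, gb_per_minute=120):
    """Minutes of footage that fit on a drive of drive_tb terabytes."""
    return drive_tb * 1000 / gb_per_minute


if __name__ == "__main__":
    for size_tb in (1, 2, 8):
        print(f"{size_tb} TB drive: about {minutes_of_footage(size_tb):.0f} minutes")
```

At that rate even a large external drive holds only about an hour of footage.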

Upgrading to a NAS

Network Attached Storage (NAS) is, as the name suggests, storage that is accessible over a network. In practice this means users on a subnet (IP range) can access and share files located on one or more servers. In Windows land with Active Directory, this is just the file sharing feature in Windows Explorer. When talking about 'a NAS', IT administrators usually refer to a specific server designed with storage in mind that has drive management, RAID, a file system, and the ability to share files using a file sharing protocol. The most common protocols are SMB (Samba being the open source, SMB-compatible implementation), NFS, and AFP. For most in the creator or video production space, SMB will be the primary protocol because it is well supported on Windows, Linux, and macOS environments. Even iPadOS for iPad devices has some support for SMB in the Files app.

Moving from one or more external drives to a NAS has several benefits. First, most NAS appliances or software projects can create a RAID group that spans multiple drives as a single storage pool. This is useful because multiple hard drives or SSDs can be grouped into a larger total capacity than any single drive would offer, allowing projects to all be kept together in a single master folder. Additionally, RAID allows for greater read and write performance, since more drives mean more total bandwidth, and it offers some level of redundancy to help keep data available. Another advantage of a NAS is that data can be shared across groups: no direct cables need to be plugged in, and everyone on the network can work off a joint project or folder(s). A NAS is always one of the first steps once creators move from a one-person operation to a larger group.

CAUTION: RAID is not a backup. A second pool to back the data up to, perhaps larger in capacity but slower in performance, is always recommended.
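To put rough numbers on the capacity and redundancy trade-off, here is a simplified estimate of usable space for a parity RAID group and for mirrored pairs. It ignores file system overhead and assumes all drives are the same size, so treat the results as ballpark figures only.

```python
def usable_tb(drive_count, drive_tb, parity_drives):
    """Rough usable capacity of a parity RAID group.

    Ignores file system overhead; parity_drives is 1 for RAID 5 / RAID-Z1
    and 2 for RAID 6 / RAID-Z2.
    """
    if parity_drives >= drive_count:
        raise ValueError("need more drives than parity drives")
    return (drive_count - parity_drives) * drive_tb


def mirror_usable_tb(drive_count, drive_tb):
    """Two-way mirrors: half the raw capacity is usable."""
    if drive_count % 2:
        raise ValueError("two-way mirrors need an even number of drives")
    return drive_count // 2 * drive_tb


if __name__ == "__main__":
    print(usable_tb(4, 8, parity_drives=1), "TB usable, survives 1 drive failure")
    print(usable_tb(6, 8, parity_drives=2), "TB usable, survives 2 drive failures")
    print(mirror_usable_tb(4, 8), "TB usable as two mirrored pairs")
```

More parity means less usable space but more failed drives tolerated, and none of these layouts replaces a separate backup.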

Choosing or building a NAS

The likes of QNAP, Synology, or Asustor offer entry-level NAS appliances which are good first steps. Typically, however, the entry-level boxes are rather underpowered and very limited in terms of how many drives one can use, etc. For non-technical users an entry-level NAS may make sense, but building one on old or leftover hardware with new drives can often offer more performance - plus reduce e-waste!

Users interested in building a NAS can look at a variety of projects, such as the open source TrueNAS and OpenMediaVault, the commercial Unraid, and more. Personally, I recommend TrueNAS as it is well supported, has a corporation maintaining the project with Enterprise options for larger organizations, and offers an attractive GUI available via a web browser for setting up and managing the drives and creating users and shares. TrueNAS also has a native implementation of the ZFS file system, which is extremely robust with built-in RAID support, copy-on-write operations, unlimited snapshots, and almost unlimited scalability - 256 quadrillion zettabytes. For perspective, that is vastly more than the entire hard drive market ships in a year, even with every drive connected together. ZFS can also be expanded by adding more RAID groups (called VDEVs) to a pool, so storage can always grow. ZFS also has a replication function called 'zfs send' which can quickly send one or more snapshots of data to a separate pool on the same or a different host. The second pool, or backup pool, can have completely different hardware, a different RAID layout, etc., but the ZFS file structure can still operate and be recovered, usually in seconds, should there be a need. TrueNAS supports all the major NAS protocols, as well as WebDAV for HTTP/HTTPS transfers, and can be extended with 3rd party projects, VMs, Jails (TrueNAS CORE) or containers (TrueNAS SCALE), making the project quite versatile.
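For the curious, this is roughly what a 'zfs send' backup looks like from the command line. The sketch below only builds the commands as text and does not run them. The pool, dataset, snapshot, and host names are all hypothetical, and in TrueNAS the same job is normally set up as a replication task in the web GUI.

```python
import shlex


def replication_commands(dataset, snapshot, backup_host, backup_dataset):
    """Build (not run) the commands for a snapshot-and-send backup."""
    snap = f"{dataset}@{snapshot}"
    take_snapshot = ["zfs", "snapshot", snap]
    # zfs send writes a stream to stdout; zfs receive reads it on the backup host.
    send_pipeline = (
        f"zfs send {shlex.quote(snap)} | "
        f"ssh {shlex.quote(backup_host)} zfs receive {shlex.quote(backup_dataset)}"
    )
    return take_snapshot, send_pipeline


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    snapshot_cmd, pipeline = replication_commands(
        "tank/projects", "2024-06-01", "backup-host", "backup/projects"
    )
    print(" ".join(snapshot_cmd))
    print(pipeline)
```

Because the stream is just data on a pipe, the receiving pool can sit on completely different hardware with a different RAID layout.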

For users interested, more information about getting started with TrueNAS is here.

Cross-site and International Collaboration

In 2020, the world was introduced to lockdowns and disease, and working from home gained unheard-of traction and interest. The old adage "necessity is the mother of invention" was never truer. Knowledge workers, including those in the creative space, were some of the first to move to working from home, leading to a major shift in the office paradigm and a boom in laptop sales. File access was suddenly something that needed to be reinvented.

For users connecting remotely, a NAS will often not be the correct choice, or at least not the total solution, for a few reasons. First, remote workers are remote and on a different network. NAS protocols - the aforementioned SMB, NFS, AFP - are not built for Internet file access. Most NAS protocols expect a constant connection to the files and will create file locks for open documents. HTTP/HTTPS traffic was designed to handle gaps and multiple hops - routing between different servers and routers - when accessing files, and is thus the preferred protocol for nearly all Internet-based traffic.

Another important reason not to expose a NAS to the Internet is security. Virtually no NAS provider recommends exposing the system to the Internet, as the appliances are built for back-end storage work over a LAN. Especially with some proprietary systems, there is little to no auditing being done on the system's firmware and base OS code. Examples abound.**

For multi-site, international collaboration, the most secure and reliable way to access files is via the same medium that popularized the Internet - a website. Nextcloud is a total collaboration platform for storing, sharing, and creating documents and files. It includes powerful tooling and apps to track notes, create user / team tasks, manage groups and access, create survey forms, and much more. For creators looking to collaborate with other team members, Nextcloud can even mount a local NAS to the platform, so that users on the network editing video, sound or image files from the NAS can then share their results via Nextcloud using secure HTTPS without having to copy the collateral to the platform. The platform has robust file versioning, and with customizable logos and an app-based model for enabling different functionality, Nextcloud can be customized to almost any workflow desired.

Nextcloud is installed as a website and can be run with either Apache or NGINX web servers. The project has several ways to install and get started - raw source, bespoke VM images, or Docker images. Since the platform is built around web servers, it can adhere to the most robust, well-established TLS/SSL encryption standards, and additional security can be added using load balancers and firewalls.
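Because everything goes over HTTPS, files can also be pushed to Nextcloud from a script through its WebDAV endpoint. The sketch below builds such an upload request without sending it. The server address, user name, app password, and file path are placeholders I made up.

```python
import base64
import urllib.request
from urllib.parse import quote


def build_upload_request(base_url, user, app_password, remote_path, data):
    """Build (not send) an HTTPS PUT against Nextcloud's WebDAV endpoint."""
    url = (
        f"{base_url.rstrip('/')}/remote.php/dav/files/"
        f"{quote(user)}/{quote(remote_path)}"
    )
    token = base64.b64encode(f"{user}:{app_password}".encode()).decode()
    request = urllib.request.Request(url, data=data, method="PUT")
    request.add_header("Authorization", f"Basic {token}")
    return request


if __name__ == "__main__":
    # Hypothetical server, user, and file names.
    req = build_upload_request(
        "https://cloud.example.com", "editor", "app-password",
        "Projects/cut-v2.mp4", b"...video bytes...",
    )
    print(req.get_method(), req.full_url)
    # To actually upload: urllib.request.urlopen(req)
```

Using an app password created in the Nextcloud security settings, instead of the main account password, keeps a leaked script from exposing the whole account.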

More information on getting started with Nextcloud is here.

Putting it all together

For industries dealing with or creating large files, there are a multitude of ways to store, share, and protect data. For individual users, local storage could be enough, but it will quickly fill up and become hard to manage. Networked file storage in the form of a NAS makes storage and file management easier, and also allows teams of editors to work together more easily. Growing even larger, or for teams spread out across different locations, Nextcloud is a total platform that is both secure and capable, not only for file sharing and storage but also for group collaboration.

Ref:
*8K file sizes https://www.signiant.com/resources/tech-article/file-size-growth-bandwidth-conundrum/
** Synology and WD vulnerabilities: https://www.securityweek.com/western-digital-synology-nas-vulnerabilities-exposed-millions-of-users-files/
** Asustor vulnerabilities: https://www.theverge.com/2022/2/22/22945962/asustor-nas-deadbolt-ransomware-attack
Get TrueNAS: https://www.truenas.com/
Get Nextcloud: https://www.nextcloud.com/