My team, the 49ers, are through to the championship game for the 5th time in 11 years despite playing some really awful football all year long. After starting the season 3-5, the Niners are on an improbable run, and the lesson here, I think, is that no matter how badly you're playing, someone else could be playing worse. Like the Packers missing a field goal and having a punt blocked, or the Dallas Cowboys running the ball with 15 seconds on the clock, or the Rams completely collapsing in the latter half of the final regular season game after being up 21-0. No matter what the reason, you just might win. Just keep doing what you're doing and keep going. You never know where you might end up.
Since the last update, I've been thinking about how helpful these updates are in guiding self-exploration of business ideas and self-directed learning.
I think I've reached the limits of broad-based learning across AI, blockchain and Rust. For the next 2 months, I'm going all in on Rust, WebAssembly and blockchain. My current freelance contract involves mostly Rust/wasm and cryptography (not crypto as in cryptocurrency). Meanwhile, Solana is a good space to combine my Rust and blockchain skills (some other chains like IronFish also use Rust).
Unfortunately, AI plays will have to wait. AI feels to me like a domain that requires some bottom-up learning to start. It's clear from the last few weekly updates about generative art that I can't just tweak a model to success. I need to understand the underlying techniques and maybe even some of the underlying math to be productive. I'm looking forward to picking up the fastai course in a couple of months. But focus, for now, will be useful.
Last week, my goal was to play around with the OpenAI API and come up with 2 commercially viable ideas for API services:
SEO keywords API: given an input of content text, the API outputs a list of keywords that are both accurate summaries of the content as well as likely to rank highly on search engines.
File naming API: given a file as an input, the API outputs a list of possible names for the file.
[Not an API] AI-assisted writing tool (convert first person to third person, generate summaries, keywords, titles, outlines) - like copy.ai but way more powerful
[Not an API] Customer Service Chatbot: give it a corpus of support/help articles and a Twitter account whose tone to emulate, and the chatbot will be your first line of customer support with some personality.
My other goal was to understand the generative model I was using more deeply and write a blog post. I don't think my explorations merit a blog post, but I'll describe the broad outline here:
"Diffusion models work by corrupting the training data by progressively adding Gaussian noise, slowly wiping out details in the data until it becomes pure noise, and then training a neural network to reverse this corruption process."
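To make the "progressively adding Gaussian noise" part concrete, here is a minimal sketch of just the corruption half, in plain Python. It treats an image as a flat list of numbers, and the linear noise schedule and its start/end values are my own assumptions for illustration, not something taken from the model I've been using. The trained network that reverses the process is not shown.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances rising linearly (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def forward_diffuse(x0, betas, rng):
    """Return x0 followed by one progressively noisier copy per step."""
    trajectory = [list(x0)]
    x = list(x0)
    for beta in betas:
        keep = math.sqrt(1.0 - beta)   # how much of the current image survives
        noise_scale = math.sqrt(beta)  # how much fresh Gaussian noise is mixed in
        x = [keep * v + noise_scale * rng.gauss(0.0, 1.0) for v in x]
        trajectory.append(x)
    return trajectory


def signal_fraction(betas):
    """Share of the original signal amplitude left after all steps."""
    return math.sqrt(math.prod(1.0 - b for b in betas))


if __name__ == "__main__":
    betas = linear_beta_schedule(1000)
    steps = forward_diffuse([1.0] * 16, betas, random.Random(0))
    print("signal left after the last step:", signal_fraction(betas))
    print("first pixels of the final, mostly-noise image:", steps[-1][:4])
```

With enough steps, almost none of the original signal is left, which is the "pure noise" end state the quote describes.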
The CC12M_1 CFG 256x256 model I've been using also uses CLIP. CLIP is a model by OpenAI that scores how well an image and a piece of text match. So, given an image and a few candidate captions, it can pick the caption that fits best.
The CC12M_1 CFG 256x256 model combines the two: given a text prompt like “NYC in the winter”, it starts with just a random noise image, uses the diffusion model to produce a better image and then, uses CLIP as the evaluation function to make sure the image being generated corresponds to the prompt/caption. It's a pretty neat trick that ends up generating some beautiful images. From my research, I know it's also possible to produce higher than 256x256 resolution images.
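The shape of that loop can be sketched with toy stand-ins. Everything here is hypothetical: toy_denoise_step is a simple neighbour-averaging filter standing in for the diffusion model, and toy_prompt_score is a negative squared distance to a target list standing in for CLIP. Neither is the real thing, and the real pipeline is far more involved. The sketch only shows the pattern of "start from noise, denoise a little, nudge toward a higher text-match score, repeat".

```python
import random


def toy_denoise_step(image):
    """Stand-in for the diffusion model: average each pixel with its neighbours."""
    last = len(image) - 1
    return [
        (image[max(i - 1, 0)] + image[i] + image[min(i + 1, last)]) / 3.0
        for i in range(len(image))
    ]


def toy_prompt_score(image, prompt_target):
    """Stand-in for CLIP: higher (closer to 0) when the image matches the prompt."""
    return -sum((p - t) ** 2 for p, t in zip(image, prompt_target))


def toy_score_gradient(image, prompt_target):
    """Direction in which each pixel should move to raise the toy score."""
    return [-2.0 * (p - t) for p, t in zip(image, prompt_target)]


def guided_sample(prompt_target, steps=50, guidance_scale=0.1, seed=0):
    """Start from random noise, then alternate denoising and score-guided nudges."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in prompt_target]
    start_score = toy_prompt_score(image, prompt_target)
    for _ in range(steps):
        image = toy_denoise_step(image)
        grad = toy_score_gradient(image, prompt_target)
        image = [p + guidance_scale * g for p, g in zip(image, grad)]
    return image, start_score, toy_prompt_score(image, prompt_target)


if __name__ == "__main__":
    # Hypothetical "prompt": a smooth ramp the image should end up resembling.
    target = [i / 15 for i in range(16)]
    image, before, after = guided_sample(target)
    print("score before:", before)
    print("score after:", after)
```

The guidance_scale knob is the interesting design choice: too small and the prompt barely influences the image, too large and the nudges fight the denoiser.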
With that, bye bye AI-generated art for the time being.
For the week ahead, I have just 1 goal:
Submit an entry to the Project Serum hackathon