Category: Artificial Intelligence

  • Finding the Best Bang for the Buck in Generative AI Hardware

    Desktop PC with NVIDIA RTX 3090 Founders Edition GPU

    As I documented last year, I made a substantial investment in my computer workstation for doing local text and image generative AI work by upgrading to 128GB of DDR4 RAM and swapping out an RTX 3070 8GB video card for NVIDIA’s flagship workstation card, the 48GB RTX A6000.

    After I used that setup to help me with editing the 66,000-word Yet Another Science Fiction Textbook (YASFT) OER, I decided to sell the A6000 to recoup that money (I sold it for more than I originally paid for it!) and purchase a more modest RTX 4060 Ti 16GB video card. It was challenging for me to justify the cost of the A6000 when I could still work, albeit more slowly, with lesser hardware.

    Then, I saw Microcenter begin selling refurbished RTX 3090 24GB Founders Edition video cards. While these cards are three years old and used, they sell for 1/5 the price of an A6000 and have nearly identical specifications, except for having only half the VRAM. I thought it would be better than plodding along with the 4060 Ti, so I decided to list that card on eBay and apply the money from its sale to the price of a 3090.

    As you can see above, the 3090 is a massive video card, occupying three slots as opposed to the two slots taken up by the 3070, A6000, and 4060 Ti shown below.

    The next hardware investment that I plan to make is meant to increase the bandwidth of my system memory. The thing about generative AI, particularly text generative AI, is that it needs both lots of memory and lots of memory bandwidth. I currently have dual-channel DDR4-3200 memory (51.2 GB/s of bandwidth). If I upgrade to a dual-channel DDR5 system, the bandwidth will increase to a theoretical maximum of 102.4 GB/s. Another option is to go with a server/workstation with a Xeon or Threadripper Pro that supports 8-channel DDR4 memory, which would yield 204.8 GB/s. Each doubling of bandwidth roughly doubles how many tokens (the constituent word/letter/punctuation components that generative AI systems piece together to create sentences, paragraphs, etc.) a text generative AI outputs per second using CPU + GPU inference (e.g., llama.cpp). If I keep watching for sales, I can piece together a DDR5 system with new hardware, but if I want to go with an eight-channel memory system, I will have to purchase the hardware used on eBay. I’m able to get work done in the meantime, so I will keep weighing my options and keep an eye out for a good deal.
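    The arithmetic behind those figures is simple: theoretical peak bandwidth is channels × transfer rate (MT/s) × 8 bytes per transfer. A quick sketch of the comparison (the dual-channel DDR5 figure assumes DDR5-6400 modules):

    ```python
    # Rough theoretical peak memory bandwidth: channels x MT/s x 8 bytes per transfer.
    # The configurations below mirror the upgrade options discussed above.

    def peak_bandwidth_gbs(channels: int, transfers_per_sec_mt: int,
                           bytes_per_transfer: int = 8) -> float:
        """Theoretical peak bandwidth in GB/s for a DDR memory configuration."""
        return channels * transfers_per_sec_mt * bytes_per_transfer / 1000

    configs = {
        "dual-channel DDR4-3200": (2, 3200),
        "dual-channel DDR5-6400": (2, 6400),
        "8-channel DDR4-3200":    (8, 3200),
    }

    for name, (channels, rate) in configs.items():
        print(f"{name}: {peak_bandwidth_gbs(channels, rate):.1f} GB/s")
    ```

    Real-world throughput falls short of these theoretical peaks, but for memory-bound token generation the ratios between configurations are what matter.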

  • How I Guide Stable Diffusion with ControlNet and Composite Images

    GIMP showing a multi-layer image of Lynn Conway on the right and her co-authored textbook Introduction to VLSI Systems on the left.

    For the illustration of Lynn Conway and her co-authored textbook Introduction to VLSI Systems at the top of yesterday’s post, I used a locally hosted installation of Automatic1111’s stable-diffusion-webui, the fine-tuned model Dreamshaper 5, which is based on StabilityAI’s Stable Diffusion 1.5 general model, and the ControlNet extension for A1111.

    Stable Diffusion is an image-generating AI model that can be utilized with different software. I used Automatic1111’s stable-diffusion-webui to instruct and configure the model to create images. In its most basic operation, I type what I want to see in the output image into the positive prompt box, type what I don’t want to see into the negative prompt box, and click “Generate.” Based on the prompts and default parameters, I will see an output image on the right that may or may not align with what I had in mind.
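    That basic operation can also be scripted: stable-diffusion-webui exposes an HTTP API when launched with its `--api` flag. Here is a minimal sketch, assuming the server is running at its default local address; the prompt strings are placeholders, not the ones used for the illustration:

    ```python
    import base64
    import json
    from urllib import request

    # Assumes stable-diffusion-webui was started with --api and is listening
    # on its default local address.
    API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"

    def build_payload(prompt: str, negative_prompt: str = "",
                      steps: int = 20, width: int = 512, height: int = 512) -> dict:
        """Assemble the JSON body for a basic txt2img request."""
        return {
            "prompt": prompt,
            "negative_prompt": negative_prompt,
            "steps": steps,
            "width": width,
            "height": height,
        }

    def generate(prompt: str, negative_prompt: str = "") -> bytes:
        """POST the request and return the first generated image as PNG bytes."""
        body = json.dumps(build_payload(prompt, negative_prompt)).encode()
        req = request.Request(API_URL, data=body,
                              headers={"Content-Type": "application/json"})
        with request.urlopen(req) as resp:
            result = json.load(resp)
        # The API returns generated images as base64-encoded strings.
        return base64.b64decode(result["images"][0])
    ```

    Calling `generate("illustration of a woman next to a textbook", "blurry, low quality")` and writing the returned bytes to a .png file reproduces the click-“Generate” workflow from a script.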

    Automatic1111's stable-diffusion-webui image generating area

    For the positive prompt, I wrote:

    illustration of a 40yo woman smiling slightly with a nervous expression and showing her teeth with strawberry-blonde hair and bangs, highly detailed, next to a textbook titled introduction to VLSI systems with microprocessor circuits on the cover, neutral background, <lora:age_slider_v6:1>

    I began by focusing on the type of image (an illustration), then describing its subject (woman), other details (the textbook), and the background (neutral). The last part in angle brackets is a LoRA, or low-rank adaptation. It further tweaks the model that I’m using, which in this case is Dreamshaper 5. This particular LoRA is an age slider, which works by inputting a number that corresponds with the physical appearance of the subject. A “1” represents roughly middle age. A higher number is older, and a lower/negative number is younger.
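    Those angle-bracket tags follow A1111’s `<lora:NAME:WEIGHT>` syntax, which makes them easy to pull out of a prompt programmatically. A small sketch (the regex is my own, not part of the webui):

    ```python
    import re

    # A1111-style LoRA tags embedded in a prompt look like <lora:NAME:WEIGHT>,
    # where WEIGHT may be negative (e.g., a younger setting on the age slider).
    LORA_TAG = re.compile(r"<lora:([^:>]+):([-+]?\d*\.?\d+)>")

    def extract_loras(prompt: str) -> list[tuple[str, float]]:
        """Return (name, weight) pairs for every LoRA tag in a prompt."""
        return [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]

    prompt = ("illustration of a 40yo woman ... neutral background, "
              "<lora:age_slider_v6:1>")
    print(extract_loras(prompt))  # [('age_slider_v6', 1.0)]
    ```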

    Automatic1111's stable-diffusion-webui ControlNet extension area

    ControlNet is an extension to Automatic1111’s stable-diffusion-webui that employs different models focused on depth, shape, body poses, etc. to shape the output image’s composition, guiding the generative AI model to produce an output image more closely aligned with what the user had in mind.

    For the Lynn Conway illustration, I used three different ControlNet units: depth (detecting what is closer and what is further away in an image), canny (one kind of edge detection for fine details), and lineart (another kind of edge detection for broader strokes). Giving each of these different levels of importance (control weight) and telling stable-diffusion-webui when to begin using a ControlNet (starting control step) and when to stop using a ControlNet (ending control step) during each image creation changes how the final image will look.
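    In scripted use, each unit’s settings (preprocessor, model, control weight, and the start/end control steps) become one entry in the request that the ControlNet extension reads. A sketch of how those three units could be represented; the field names follow the extension’s API, and the model names, weights, and step windows here are illustrative, not the exact values I used:

    ```python
    def controlnet_unit(image_b64: str, module: str, model: str,
                        weight: float, start: float, end: float) -> dict:
        """One ControlNet unit: guidance image, preprocessor, model, step window."""
        return {
            "input_image": image_b64,   # base64-encoded guidance image
            "module": module,           # preprocessor: "depth", "canny", "lineart", ...
            "model": model,             # matching ControlNet model checkpoint
            "weight": weight,           # control weight (relative importance)
            "guidance_start": start,    # starting control step (fraction, 0.0-1.0)
            "guidance_end": end,        # ending control step (fraction, 0.0-1.0)
        }

    # All three units share the same composite guidance image.
    composite_b64 = "...base64 of the GIMP composite..."  # placeholder

    units = [
        controlnet_unit(composite_b64, "depth",   "control_v11f1p_sd15_depth",  0.8, 0.0, 0.9),
        controlnet_unit(composite_b64, "canny",   "control_v11p_sd15_canny",    0.5, 0.0, 0.6),
        controlnet_unit(composite_b64, "lineart", "control_v11p_sd15_lineart",  0.6, 0.1, 0.8),
    ]

    # These attach to a txt2img request body as:
    #   payload["alwayson_scripts"] = {"controlnet": {"args": units}}
    ```

    Varying each unit’s weight and step window is exactly the trial-and-error knob-turning described above, just expressed as data instead of UI clicks.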

    Typically, each ControlNet unit uses an image as input for its guidance on the generative AI model. I used the GNU Image Manipulation Program (GIMP) to create a composite image with a photo of Lynn Conway on the right and a photo of her co-authored textbook on the left (see the screenshot at the top of this post). Thankfully, Charles Rogers added his photo of Conway to Wikipedia under a CC BY-SA 2.5 license, which gives others the right to remix the photo with credit to the original author, as I’ve done. Because the photo cropped Conway’s right arm, I rebuilt the arm using GIMP’s clone tool.

    I fed the composite image into the three ControlNet units, and after trial and error with each unit’s settings, A1111’s stable-diffusion-webui produced an image that I was happy with and used in yesterday’s post. I used a similar workflow to create the Jef Raskin illustration for this post, too.

  • Joan Slonczewski Added to Yet Another Science Fiction Textbook (YASFT)

    An image of a woman walking through a tunnel toward an ocean's beach and a sky filled with stars inspired by Joan Slonczewski's novel A Door Into Ocean. Created with Stable Diffusion.

    I added a whole new section on the Hard SF writer Joan Slonczewski (they/them/theirs) to the Feminist SF chapter of the OER Yet Another Science Fiction Textbook (YASFT). It gives students an overview of their background as a scientist, writer, and Quaker, and it discusses three representative novels from their oeuvre: A Door Into Ocean (1986), Brain Plague (2000), and The Highest Frontier (2011). As with the Afrofuturism chapter, I brought in more cited critical analysis of Slonczewski’s writing, given parenthetically with full citations instead of a works cited list or footnotes.

    Slonczewski’s A Door Into Ocean was the inspiration for the image above that I created using Stable Diffusion. It took the better part of a day: first creating the basic structure of the image, then inpainting specific details such as the woman’s footprints in the sand, and finally feeding the inpainted image back into Stable Diffusion’s ControlNet to produce the final image.

  • First Anniversary of My Generative Artificial Intelligence (AI) and Pedagogy Bibliography and Resource List

    Artificial intelligence in a giant room of computers. Image generated with Stable Diffusion.

    Tomorrow is the first anniversary of the Generative Artificial Intelligence (AI) and Pedagogy Bibliography and Resource List.

    I first launched it on 13 April 2023 when I was directing the Professional and Technical Writing (PTW) Program at City Tech before going on my current research sabbatical.

    The motivation for the resource was twofold: I wanted to learn all that I could about generative AI for my professional work as a teacher and scholar, and I needed to understand the changes taking place due to these new technologies for the benefit of my students, who had already expressed concern and wonder about them.

    I launched it with more than 150 MLA-formatted citations of books, collections, and articles related to AI and generative AI with an emphasis on teaching, but also including useful background and area-specific sources.

    Now, it has over 550 citations! It also includes a growing list of online resources with direct links!

    I’ll keep adding to it periodically, and if you have some sources that I haven’t included but should, drop me a line (my email address is in the sidebar to the right).

  • Lenovo ThinkPad P1 Gen 4 Powerhouse Workstation

    Lenovo ThinkPad P1 Gen 4 16" QHD+ with i9-11950H, 64GB RAM, 2TB SSD, and RTX A5000, with screen open and showing Debian 12 desktop

    About halfway through my sabbatical, I needed to visit my parents in Georgia, but I also needed to continue working on my research projects. I didn’t feel safe about lugging my A6000 desktop computer (in checked baggage or shipping), so I followed my own advice and started looking for a used workstation-class laptop.

    It took a few weeks, but I landed this awesome, practically new Lenovo ThinkPad P1 Gen 4 from a seller on eBay. It has a 16″ QHD+ screen (that I scale down to 1080p for my eyes), an i9-11950H (8 core/16 thread) CPU, 64GB DDR4 RAM, a 2TB SSD, and an NVIDIA RTX A5000 16GB discrete video card (Stable Diffusion and llama.cpp worked without any hiccups).

    It plows through all of the work that I throw at it, but it does sound like a jet engine when its two cooling fans spin up. I have found that raising it off the desk by a couple of inches helps tremendously with cooling by increasing air flow. I had been using rigged-up stands, but I built a special stand out of LEGO that I will show in detail tomorrow (but there’s a sneak peek in the photos below).

    I can’t sing this laptop’s praises loudly enough! It works well with Debian 12 Bookworm, but it does have some issues with power saving/hibernation, which is a known issue and may have some workarounds that I haven’t tried yet.

    The one thing that it can’t do without when doing GPU-focused work is its chonky 230-watt external power supply. I bring it with me when I know it will eat through its battery doing jobs. I recently upgraded my backpack to a Mystery Ranch 2-Day Assault Pack, which has a built-in sleeve that easily accommodates 16″ laptops like this one (though the sleeve’s side egress slot can be tricky to use due to the ThinkPad’s thickness).